CHAPTER I - COMPUTER ORGANIZATIONS

I-1  Organization and Command Structure for Content-Addressable
Memory Systems

1.1.1  Introduction

This  report  describes  some  methods  for  parallel  processing  of  data 
through  use  of  a  content-addressable  memory  (CAM).  Processing  is  parallel 
in  the  sense  that  all  words  in  CAM  or  any  designated  subset  of  these  may  be 
processed  in  a  single  operation.  Allowable  data  processing  operations  and 
allowable  arguments  for  these  are  dependent  on  the  logical  and  physical 
properties  of  CAM.  The  significance  of  these  data  processing  methods  will 
be  evaluated  relative  to  both  hardware  requirements  and  execution  time. 

The  objectives  of  this  report  are: 

1)  Define  the  class  of  data  processing  operations  which  are 
feasible  in  CAM  and  which  may  make  use  of  the  parallelism 
of  CAM. 

2)  Describe some variations in logical organization of CAM.
Describe the data processing operations which are made
more or less effective by each variation.  For a selected
CAM organization, present an extendable command set and
a control organization to permit mechanization of the set.

3)  Define problem characteristics which generally guarantee
that solution time will be decreased by use of CAM.

The CAM memory system was initially proposed by Slade et al.
to allow retrieval of stored data by reference to content of a cell rather than
by its physical location.  Cell locations or contents are retrieved if a specified
portion of their content equals a key word.  The required equality search is
executed in parallel for all cells.

In conventional systems, data is stored in a random access memory
as a one-dimensional array.  For data not initially in this form a single index
mark may be assigned to each memory element according to a "memory
mapping function" which is formed on reference properties and assumes
positive integral values.  For example, a three-dimensional array with indices
i = 0(1)(N-1), j = 0(1)(N-1), k = 0(1)(N-1) may be converted to a one-dimensional
array with index $l = 0(1)(N^{3}-1)$ by the mapping function

$l = i + Nj + N^{2}k$

Data having given reference properties are assigned a storage location
determined from the mapping function evaluated for these properties.  Data may be
retrieved by location addressing with knowledge of the reference properties,
the mapping function and the correspondence of mapping function values to
storage locations.
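
The mapping function above can be sketched in a few lines.  This is only an
illustration: the function names to_linear and from_linear are hypothetical
and do not appear in the report.

```python
def to_linear(i, j, k, n):
    """Map indices (i, j, k), each in 0..n-1, to l = i + n*j + n^2*k."""
    for index in (i, j, k):
        if not 0 <= index < n:
            raise ValueError("index out of range")
    return i + n * j + n * n * k


def from_linear(l, n):
    """Recover (i, j, k) from the one-dimensional index l."""
    k, remainder = divmod(l, n * n)
    j, i = divmod(remainder, n)
    return i, j, k
```

Retrieval by location requires exactly the knowledge named in the text: the
reference properties (i, j, k) and the mapping function.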

Location  addressing  becomes  cumbersome  and  content-addressing 
efficient  under  conditions  described  below: 

1)  Data  is  to  be  addressed  by  several  sets  of  reference  properties. 

If  location  addressing  is  used,  several  mapping  functions  must 
be  defined  and  the  data  item  or  reference  to  it  must  be  stored 
with  a  multiplicity  equal  to  the  number  of  possible  sets  of 
reference  properties.  If  this  number  is  unknown,  location 
addressing  is  unusable  without  observation  of  every  piece  of 
stored  data  with  possible  reordering  and  relocation.  An  example 
would  be  a  table  of  function  values  which  is  referenced  by  one  or 
more  function  values  as  well  as  by  one  or  more  arguments. 

2)  Data  elements  are  sparse  relative  to  values  of  the  reference 
property.  Location  addressing  is  often  forced  to  assign  cells 
for  which  no  data  are  stored.  An  example  would  be  a  sparse 
matrix,  where  each  element  is  stored  according  to  a  mapping 
function  on  its  matrix  indices  with  the  resulting  assignment  of 
a  separate  cell  to  each  null  element. 

3)  Data  become  dynamically  disordered  in  memory  during  processing. 
If  it  becomes  necessary  during  processing  to  refer  to  data  by 
some  property,  reordering  and  relocation  will  be  necessary 
using  location  addressing. 
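
Conditions 1) and 2) can be illustrated with a sparse matrix.  The sketch
below is a software stand-in only: the dictionary layout, the field names and
the function search are assumptions made for the example, and the list
comprehension merely imitates what CAM would do in one parallel search.

```python
def location_addressed(n, entries):
    """Dense layout: one cell per (i, j), including every null element."""
    cells = [0] * (n * n)
    for (i, j), value in entries.items():
        cells[i + n * j] = value
    return cells


def content_addressed(entries):
    """CAM-style layout: one cell per stored element, indices kept with the value."""
    return [{"i": i, "j": j, "value": v} for (i, j), v in entries.items()]


def search(cells, **key):
    """Return every cell whose named fields equal the key fields."""
    return [c for c in cells if all(c[f] == v for f, v in key.items())]
```

With content addressing the same cells can be reached by row, by column or by
value without defining a second mapping function or storing the data twice.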

Other  uses  for  CAM  will  become  evident  as  its  properties  are  detailed. 

It is evident that the memory mapping function need not be single-valued for CAM;
hence multiple-membered sets of CAM cells may be defined and addressed.
In the next section some possible set operations for CAM are considered.


1.1.2  Set Operations in CAM

Each word cell in CAM may be defined by content of the "extended cell"
as a member of arbitrary sets, S.  The extended cell includes bit positions in
the  memory  matrix  and  in  any  external  but  associated  storage  elements.  Any 
data  processing  operation  in  CAM  may  be  considered  as  a  set  operation 
executed  over  sets,  S.  A  useful  list  of  set  operations  executable  in  CAM  is 
presented  below.  The  particular  command  structure  presented  for  the  CAM 
system  described  later  in  this  chapter  synthesizes  many  of  these  operations. 

It is assumed that CAM exists as a subsystem for a controlling computer.
The computer can present some "key", or reference, word to CAM,
generate arbitrary functions of this key, present some "tag" word consisting
of "1's", "0's", and "don't cares" to define set marks, call for data stored
in CAM, present data to CAM for storage and execute various tests defined
below.

1)  Mark  a  set  S.  Mark  the  set  of  CAM  cells  having  some  element 
of  their  Boolean  character  in  common.  A  mark  exists  as 
unique  states  for  specified  bit  positions  of  the  extended  cell. 
Boolean character of a cell is defined as some Boolean function
of cell contents and of contents of a reference or "key"
word.  Cells of a given Boolean character may be identified
and marked if a set of primitive Boolean operators is mechanized
at each word cell in CAM.

The following operations may be executed over marked sets.

2)  Mark the sum or union of sets S1 and S2 in CAM.  Mark the set
of CAM cells which belong to at least one of the sets S1 and S2.

3)  Mark the intersection of sets S1 and S2 in CAM.  Mark the set
of CAM cells belonging to both S1 and S2.

4)  Mark the difference of sets, S1 minus S2.  Mark the set of CAM
cells contained in S1 but not contained in S2.
5)  Replace specified bits for cells in some set, S.  For example:

a)  Store data into a single cell having a given mark.
The mark may be changed in this process.

b)  Store separate data items into each cell having a
given mark.

c)  Replace specified data or mark bits with defined bits
for each cell having a given mark.  For example, void
a set by erasing a set mark.

d)  Store data at some externally specified location.

6)  Retrieve set members stored in CAM.  For example:

a)  Read contents of a single cell having a given mark.

b)  Read contents of all cells having a given mark.

c)  Read contents of a cell at some externally specified
location.

7)  Test for the number of members in a given set.

a)  Test if a set is void.  Does any CAM cell have a given
mark?

b)  Test if a set has more than a single member.  Does
more than one CAM cell have a given mark?

c)  Test if a set equals the set of all CAM cells.  Do all
CAM cells have a given mark?

d)  Test if a set lacks more than a single member to
equal the set of all CAM cells.  Does more than one CAM
cell lack a given mark?

If contents of CAM cells are interpreted as numeric quantities, the
following set operations may be defined and executed:

8)  Locate the member or members of some input set having the
largest (smallest) signed or absolute value in some specified
field of bit positions.  Members of the input set having this
property are marked as an output set.

9)  Locate the member or members of some input set having a
value for some specified bit field which is >, ≥, <, ≤, = or ≠ a key.
Members of the input set having this property are marked as
an output set.

10)  Execute arithmetic operations for sets of number pairs where
each pair included in the set may be defined in the following
ways:

a)  One  member  of  a  pair  is  an  externally  derived  key 
common  to  all  pairs;  the  second  member  is  stored 
in  a  bit  field  for  each  CAM  cell. 

b)  Pairs  are  stored  in  two  bit  fields  within  a  single 
CAM  cell. 

c)  A bit field is specified for each cell in each of two sets.
To each field in one set there corresponds a field in the
other set and conversely.  Pairs are stored in corresponding
fields.

Examples of arithmetic operations are:

a)  Form the sum of a pair.

b)  Form the product of a pair.
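
A rough software model of several of the set operations listed above is
sketched below.  The class CamModel and its method names are hypothetical;
marked sets are represented as Python sets of cell addresses, so that union,
intersection and difference (operations 2 to 4) are the ordinary set operators
|, & and -.  The loops stand in for operations that CAM would execute in
parallel over all cells.

```python
import operator


class CamModel:
    def __init__(self, words):
        self.words = list(words)

    def mark(self, predicate):
        """Operation 1: mark the cells whose contents satisfy a Boolean function."""
        return {a for a, w in enumerate(self.words) if predicate(w)}

    def is_void(self, s):
        """Test 7a."""
        return len(s) == 0

    def is_multiple(self, s):
        """Test 7b."""
        return len(s) > 1

    def is_all(self, s):
        """Test 7c."""
        return len(s) == len(self.words)

    def largest(self, s):
        """Operation 8: mark the members of s holding the largest value."""
        if not s:
            return set()
        top = max(self.words[a] for a in s)
        return {a for a in s if self.words[a] == top}

    def compare(self, s, op, key):
        """Operation 9: mark the members of s standing in relation op to the key."""
        return {a for a in s if op(self.words[a], key)}

    def add_key(self, s, key):
        """Operation 10a: add a common key to the word in every marked cell."""
        for a in s:
            self.words[a] += key
```

For example, cam.compare(cam.mark(lambda w: True), operator.lt, 6) marks every
cell holding a value less than the key 6.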

The  CAM  system  requirements  for  mechanizing  these  set  operations  are 
discussed  in  the  following  section. 

1.1.3  Organization of CAM Systems

A word cell in CAM may be referenced by some address operator (a
Boolean function of cell contents and of bits in some key word) rather than (or
in addition to) reference by cell location.  One or more operators are mechanized
at each CAM cell and may be evaluated simultaneously for all cells.
Contents of cells are not destroyed in evaluation of an address operator.  Results
of evaluation are stored in a detector.

Typical address operators are the compare operators (>, ≥, <, ≤, =, ≠)
operating on signed numbers or magnitudes.  One or more compare operators
could be mechanized at each cell and would compare content of that cell to an
externally presented key in a sense defined by the operator.  The operator has
the value 1 if the defined comparison is satisfied, and 0 if it is not.

Functional  blocks  for  a  possible  CAM  system  are  illustrated  in  Figure  1.1. 

The mask register, if present, is 1 in those bit positions for which contents
of the key register and memory matrix serve as arguments for an address
operator and is 0 elsewhere.  Bit positions for which the mask register is 0 are
said to be "masked".  Contents of the mask register and of the key register may
be arbitrarily specified under program control.
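
The effect of the mask on an equality comparison can be sketched as follows.
Words, key and mask are modelled as Python integers, and the function names are
assumptions made for illustration.

```python
def masked_equal(cell, key, mask):
    """True if cell and key agree in every bit position where the mask is 1."""
    return ((cell ^ key) & mask) == 0


def search(memory, key, mask):
    """Evaluate the masked equality operator for every word; returns the detector plane."""
    return [masked_equal(cell, key, mask) for cell in memory]
```

Bit positions where the mask is 0 never contribute to a mismatch, which is the
sense in which they are "masked".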

The  exchange  register  required  to  write  data  into  memory  and  read  data 
from  memory  may  be  a  unique  register  or  this  function  may  be  served  by  the 
key  register. 

The detector may contain a single storage element which is set true
when an address operator is satisfied for some word within memory.  The
equality operator would generally be mechanized for this simple type of detector.
Addresses for words satisfying the address operator are not available.  Match
type indication (i.e., indication of a multiplicity of words satisfying the address
operator) may or may not be available.  Alternatively the detector may be a
plane containing a mark storage element for each word cell in memory.  Any
address  operator  may  then  be  mechanized.  However,  in  this  instance,  equality 
comparison  with  summation  of  successive  marked  sets  in  the  detector  plane  is 
a  primitive  which,  repeatedly  applied,  allows  synthesis  of  any  address  operator. 
An  equality  comparison  is  defined  to  leave  detector  elements  true  for  word 
cells  which  match  a  masked  key. 

Address  operators  and  detector  elements  may  be  desired  with  high 
multiplicity;  hence  it  is  desirable  that  their  logical  properties  be  formulated 
so  as  to  imply  inexpensive  mechanization.  The  logical  network  which  mechanizes 
each address operator is assumed to be quiescently in its false state.  The true
state is entered transiently if the address operator is satisfied during an evaluation.
The state of a detector element may be altered by a true address operator
and is not altered by a false operator.

An equality comparison may be mechanized through use of either the
equality or the inequality address operator.  If the equality operator is used,
detector elements are initially reset and are then set by a true operator.  If
the inequality operator is used, detector elements are initially set and are
then reset by a true operator.  Successive evaluations of the equality operator,
with no intervening reset of the detector, leave a detector element true if
contents of its word cell match at least one masked key.  The resulting set is
the sum of sets matching individual keys.  Successive evaluations of the
inequality operator, with no intervening set of the detector, leave a detector
element true if contents of its word cell match all masked keys.  The resulting
set is the intersection of sets matching individual keys.  For logical properties
described above, the equality operator must be evaluated parallel-by-bit
(i.e., for all the bits in a masked key).  The inequality operator may be evaluated
either serial- or parallel-by-bit.
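
The two accumulation rules can be sketched as below.  The detector plane is
modelled as a list of Booleans and each key is a (key, mask) pair of integers;
all names are hypothetical.

```python
def search_equality(memory, detector, key, mask):
    """Equality operator: a true operator (match) sets the detector element."""
    return [d or ((w ^ key) & mask) == 0 for w, d in zip(memory, detector)]


def search_inequality(memory, detector, key, mask):
    """Inequality operator: a true operator (mismatch) resets the detector element."""
    return [d and ((w ^ key) & mask) == 0 for w, d in zip(memory, detector)]


def mark_union(memory, keys):
    detector = [False] * len(memory)          # detector initially reset
    for key, mask in keys:                    # no intervening reset
        detector = search_equality(memory, detector, key, mask)
    return detector


def mark_intersection(memory, keys):
    detector = [True] * len(memory)           # detector initially set
    for key, mask in keys:                    # no intervening set
        detector = search_inequality(memory, detector, key, mask)
    return detector
```

A single evaluation of either operator from the proper initial state gives the
same equality comparison; the two differ only in how successive searches
accumulate.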

To read from or write into a cell marked by a detector plane, some
link (Figure 1.1) is required between the detector and the read-write address
selector.  If  many  cells  are  marked  by  the  detector,  a  word  commutator  is 
required  in  this  link  to  select  a  single  cell,  or  to  sequentially  select  all  cells, 
for  reading  or  writing. 

If  linear  rather  than  coincident  address  selection  is  employed,  it  is 
further  possible  to  replace,  in  parallel,  specified  bits  in  detector-marked  cells 
by  defined  bits.  In  particular,  control  bits  may  be  stored  within  the  memory 
matrix.  Control  bits  mark  sets  as  do  elements  of  the  detector  plane  but  do  not 
link the address selector.  They may be cleared to "0" throughout CAM and
may be set to "1" or "0" in detector-marked cells.

Having at least one control bit in addition to a detector element for each
CAM cell (the bit may be stored internal or external to the memory matrix)
and using either the equality or inequality address operator, it is possible to
mark the complement set of the detector-marked set.  The control bit is
written to "1" in detector-marked cells and subsequently interrogated for "0".  It
is possible to mark the sum of arbitrary detector-marked sets by resetting the
control bit for all cells and subsequently writing it to "1" for each detector-marked
set in turn.  It is possible to mark the intersection of detector-marked
sets by writing the control bit to "1" for all cells, subsequently resetting it for
each detector-marked set and finally complementing it.  It is possible to mark
the difference of sets, S1 minus S2, by complementing S2 and marking the
intersection of this set with S1 by methods illustrated above.
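
A minimal sketch of set marking with a single control bit follows.  Detector
planes are lists of Booleans and the control bit is a list of 0/1 values; the
function names are hypothetical.  Note one assumption: the intersection is
written here through De Morgan's law, resetting the control bit in cells lying
outside each set, which is one possible reading of the step sequence in the
text rather than a transcription of it.

```python
def write_marked(control, detector, value):
    """Parallel replacement: write value into the control bit of every marked cell."""
    return [value if d else c for c, d in zip(control, detector)]


def complement(detector):
    control = [0] * len(detector)                 # clear control bit throughout CAM
    control = write_marked(control, detector, 1)  # write 1 in detector-marked cells
    return [c == 0 for c in control]              # interrogate for 0


def union(marked_sets, n):
    control = [0] * n
    for detector in marked_sets:
        control = write_marked(control, detector, 1)
    return [c == 1 for c in control]


def intersection(marked_sets, n):
    control = [1] * n
    for detector in marked_sets:
        # Assumed reading: reset the bit wherever a cell is NOT in this set.
        control = write_marked(control, complement(detector), 0)
    return [c == 1 for c in control]


def difference(s1, s2):
    """S1 minus S2: complement S2, then intersect with S1."""
    return intersection([s1, complement(s2)], len(s1))
```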

If a CAM system contains a detector plane, a word commutator, and
is linearly selected such that parallel replacement may occur in detector-marked
cells, set members may be altered and retrieved in all senses defined
in Section 1.1.2.  If this CAM system is further mechanized to test for the
presence of the void or of multiple-membered sets in the detector plane, all
algorithms listed in Tables I, II and III and described in Section 1.1.6 are
executable.

Cryogenic CAM systems having many of these capabilities have been
described by Seeber, Davies and others (references 8, 10, 11).  The proposed system
differs from previously described systems in that parallel replacement is
generalized to all bit positions in memory.  Control bits are stored in the
memory matrix and may be evaluated by the address operator.  For previously
described CAMs, parallel replacement was possible only in certain specially
wired bit positions designated as mark or control bits which were stored
external to the memory matrix.  When these capabilities are provided, the
ability to identify and transform sets is significantly extended.

For CAM systems which are coincidence selected (rather than linearly
selected) it is not possible to replace bits in parallel over many cells.  The
arithmetic algorithms (additions, multiplications and format conversions)
discussed in Section 1.1.6 are not executable.  Control bits must be stored
external to the memory matrix.

For CAM systems having a detector plane linked to a read-write address
selector but no word commutator, the detector-marked set must be reduced to
a single member in order to access this member for reading or writing.
Conventional coordinate addressing may also be possible.  It is probable that
tests for void and multiple-membered detector-marked sets would be incorporated
into this system.  A coincidence-selected CAM system having
essentially this capability and using the inequality address operator has been
described by Petersen and co-workers.

A CAM system having a single detector element (rather than a plane)
and the equality address operator has been proposed by Slade and co-workers
(references 1, 9).  Data are read from this memory by executing a serial-by-bit
match test.  Cells are location-addressed for writing.

1.1.4  Proposed  CAM  System 

It  is  now  desired  to  particularize  the  description  of  CAM  to  the  extent 
necessary  for  discussion  of  execution  sequences  and  timing  for  specific 
algorithms.  A block diagram for the proposed CAM system is shown in
Figure 1.2.  The system has the following properties:

1)  The equality comparison (using either the equality or the
inequality address operator) is mechanized.

2)  Memory elements are nondestructively interrogated and
outputs are logically combined for each word to form the
address operator.  Detector elements (P) are initially
cleared to the matched state and are then set to the mismatched
state by a true address operator.  Masked bits
(M=0) are not interrogated.

3)  A single data register (D) serves the function of exchange
and key registers.

4)  The  match  type  indicator  (MTI)  generates  one  of  three 
signals  indicating  the  multiplicity  of  matching  words. 

These  are: 

a)  No  matching  word. 

b)  Single  matching  word. 

c)  Multiple  matching  words. 

5)  A single uniquely matched word may be selected for reading
or writing.  Reading is nondestructive.  Selection may be on
the basis of an externally supplied address as well as on the
basis of detector-plane contents.

6)  The address selector is linked to the detector through a word
commutator which allows multiple words to be sequentially
selected for reading or writing without repeating the search
for each word processed.

7)  A single bit position for a set of nonuniquely matched words
may be altered to either the "1" or "0" state.  This bit may be
any selected bit of the word and is altered in parallel for all
words.

8)  A group of control bits is defined for each word in CAM.
Each bit designates the word as a member of some set.
Control bits are stored within the memory matrix.  They
are searched and written in configurations specified by a
tag contained within commands which use them.  The tag
specifies each control bit to be "1", "0" or masked.
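
Properties 4) and 8) can be sketched together.  The tag is modelled here as a
string over "1", "0" and "x" (masked); this encoding and the function names are
assumptions made for the example, not the report's format.

```python
def tag_matches(control_bits, tag):
    """True if every unmasked tag position equals the corresponding control bit."""
    return all(t == "x" or int(t) == b for b, t in zip(control_bits, tag))


def search_tag(words, tag):
    """Return the detector plane after a search on control bits only."""
    return [tag_matches(bits, tag) for bits in words]


def match_type(detector):
    """Match type indicator: one of three signals on the multiplicity of matches."""
    count = sum(1 for d in detector if d)
    if count == 0:
        return "none"
    if count == 1:
        return "single"
    return "multiple"
```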

The inequality comparison is chosen because it allows simple summation
of outputs from memory elements to generate the required Exclusive Or
function.  Read-out may be dynamic and pulses need not be in time coincidence.
Interrogation of successive bits can be delayed sufficiently to reduce objectionable
noise buildup.  Masking is simply accomplished by inhibited interrogation
of masked bit positions.

The ability to sequentially select members of a matching set allows
simple loading of the memory in instances where locations of vacant cells are
unknown to the program.  The vacant set is selected by searching on appropriate
control bits and words are sequentially written into this set.  Circuitry
required to accomplish this task is directly usable in the task of sequentially
reading members of some selected set.
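
Loading into vacant cells can be sketched as one search followed by sequential
writes.  Each cell is modelled as a dictionary with a "vacant" control bit and
a "data" field; these names and the function load are hypothetical.

```python
def load(cam, items):
    """Write items into vacant cells in turn; return the number of items written."""
    # One search on the control bit marks the vacant set in the detector plane.
    detector = [cell["vacant"] for cell in cam]
    # The word commutator steps through the marked cells without a new search.
    addresses = [a for a, marked in enumerate(detector) if marked]
    written = 0
    for address, item in zip(addresses, items):
        cam[address] = {"vacant": False, "data": item}
        written += 1
    return written
```

The program never needs to know where the vacant cells are; items that do not
fit are simply left unwritten in this sketch.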

There are instances in which it is desired to address data or vacant
cells by location rather than content.  It is possible to assign some unique
"serial number" to some field of each cell and to perform "location addressing"
by searching for this serial number.  Alternatively memory addressing
circuitry can accept an externally supplied address as well as addresses
determined from detector plane contents.  The latter technique is employed
(without precluding use of the former) when it is more conservative of time
and of bit positions in memory.

The proposed memory allows parallel replacement of a single bit in
each detector-marked word.  Control bits may thus be stored in the memory
matrix.  The  algorithms  to  be  presented  demonstrate  that  the  extension  of 
this  property  to  all  bits  is  a  useful  adjunct  to  the  simultaneous  search  property. 
In  the  usual  case  it  is  only  desired  to  alter  a  single  bit  of  each  selected  set 
and  circuit  considerations  may  limit  simultaneous  alteration  to  a  single  bit. 
This  property  requires  the  memory  to  be  linearly  selected. 

It may be desirable to store some externally presented pattern of "1's"
and "0's" into the detector plane and thus select a multi-membered set of CAM
cells.  As an example, this pattern may be derived from some character
recognition device and it may be desired to correlate it in some sense with
CAM contents.  Having the parallel replacement property, direct parallel
entry to the detector plane also allows CAM to be loaded parallel-by-word,
serial-by-bit.

It  is  desirable  to  distinguish  control  bits  conceptually  from  data  even 
though  both  are  stored  in  the  memory  matrix.  Some  of  the  reasons  for  making 
this  distinction  are: 

1)  Control bits may be cleared to "0" throughout memory,
thus erasing set inclusion statements prior to establishing
new statements.  This operation is not necessarily
performed on data bits.  Initial reset of the entire
memory may be accomplished by sequential writing of null
words.

2)  The  requirement  for  altering  the  tag  which  specifies 
control  bit  configurations  is  often  distinct  from  that  for 
altering  the  data  mask  and  data  key. 

1.1.5  Control for Proposed CAM

The proposed CAM is, for generality, assumed to be a component of a
larger computer system.  A CAM command is initiated by first loading the
data and mask registers and inserting an operation code into a command
register.  A "START" pulse is then issued to the CAM system.  The command
is executed asynchronously and a "COMPLETION" pulse is sent to the controlling
computer.  Execution times for CAM commands are measured from START
pulse to COMPLETION pulse.

The CAM control consists of a command sequencer, a search drive
selector, a word selector and a set of index registers.  The command sequencer
controls the microsteps required for execution of a CAM command.  The search
drive selector provides sequential delay for each bit interrogated during a
search command, bypassing all masked bits with some lesser delay.  The
word selector sequentially selects multiple matching words for reading or writing.
The scan register serves as a bit index in certain commands which are executed
serial-by-bit.  Each of these devices is more fully described in the following
paragraphs.

The command sequencer (Figure 1.3) contains a command register, a
state counter, a command decoder and an aperiodic pulse generator.  The command
register stores an operation code and, together with "branch" and "end" signals
from the CAM system, controls the sequence of states for the state counter.
The command decoder translates contents of the state counter and command
register into elementary control signals used throughout the CAM system and in
particular to select time delays between successive clock pulses.

The clock generator is triggered by the START pulse and at each
following interval by one of a set of delay generators (T1 through Tn in Figure 1.3).
Each delay generator has logical inputs (t) and (i).  When triggered at the logical
input (t), a delay generator transmits a trigger pulse to the clock generator after
some preset delay.  When triggered at the logical input (i), the delay generator
transmits a trigger pulse to the clock generator with no delay.  The (t) input is
normally triggered by a clock pulse ANDed with a state and an instruction code.
In this mode the delay generator controls the duration of the state which is
entered on the clock pulse triggering the delay generator.  The (i) input is
normally triggered by some completion signal developed within CAM for
operations of variable time duration.  Both inputs of a delay generator may
be wired, allowing normal duration for a state for normal circumstances and
premature termination if special conditions are met.


The search drive control provides sequential delay, Tsbd1, for each
bit interrogated (Mi = "1") during a search operation, bypassing masked bits
(Mi = "0") with some lesser delay, Tsbd2.  In many memory mechanizations this
delay is of substantial use in reducing noise buildup at matched cells.  This
circuit is repeated for each bit of the word including control bits.  A completion
signal is developed at the final stage of the drive control.
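
Under the assumption that the stage delays simply add along the drive control
chain, the time from the start of a search to its completion signal can be
estimated as below.  The function name is hypothetical and the delay values
are in arbitrary units.

```python
def search_drive_time(mask_bits, t_sbd1, t_sbd2):
    """Estimate search time: t_sbd1 per interrogated bit, t_sbd2 per masked bit."""
    interrogated = sum(1 for m in mask_bits if m == 1)
    masked = len(mask_bits) - interrogated
    return interrogated * t_sbd1 + masked * t_sbd2
```

A search confined to a narrow field therefore costs little more than the
bypass delay of the remaining positions.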

The word commutator sequentially selects one of a set of matching
words for reading or writing.  Interconnection of detector plane storage
element, word driver and word commutator element for some "jth" CAM
word is shown in Figure 1.4.  This circuit is repeated for each word cell in
CAM.  The signal "Cj" is propagated through a single element in the commutator
chain in a time, Tswd.  A "select complete" signal is generated to
indicate the time at which unique address selection is achieved.

FIGURE 1.4

Pj = 1 denotes that the "jth" detector element is matched.
Cj = 1 denotes that none of words j through h are selected.
Sj = 1 denotes that the "jth" word is selected.
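
One way to picture the commutator chain is as a priority chain over the
detector plane.  The sketch below assumes that the carry signal enters at the
first word and that a word is selected when it is matched and no earlier word
has been selected; the direction of propagation and the reset of a detector
element after its word is processed are assumptions, not details taken from
Figure 1.4.

```python
def select(detector):
    """Return the S signals: at most one matched word is selected."""
    selected = []
    carry = True                      # chain input: nothing selected so far
    for p in detector:
        selected.append(p and carry)  # S = P AND incoming carry
        carry = carry and not p       # carry is blocked by the first matched word
    return selected


def commutate(detector):
    """Return the order in which matched words are selected."""
    detector = list(detector)
    order = []
    while any(detector):
        j = select(detector).index(True)
        order.append(j)
        detector[j] = False           # assumed: element is reset once its word is processed
    return order
```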

The set of index registers is used to form auxiliary masks and keys
for compound commands and to count steps in these commands.  Each index
register consists of an index counter which addresses some bit position in the
data or mask registers and of two limit registers.  Limit registers are loaded
from an index word presented by the controlling computer.  A pair of limit
registers define a contiguous field of bit positions in CAM.  Bit positions are
considered consecutively numbered starting at the least significant bit (LSB)
of the cell.  The address of the LSB of the field is stored in the lower limit
register; the address of the most significant bit (MSB) of the field is stored
in the upper limit register.  All bit positions having numbers greater than or
equal to the LSB but less than or equal to the MSB are considered in the field.
The CAM control and the controlling computer can load each index counter
from one of its limit registers, compare each counter to its upper or lower
limit register and increment or decrement counters and limit registers.  Bits
in D and M addressed by an index counter may be altered or tested by the CAM
control or the controlling computer.
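
The field defined by a pair of limit registers can be sketched as follows.
Bit 0 is taken as the LSB of the cell, as in the text; the function names are
hypothetical.

```python
def field_mask(lsb, msb):
    """Mask with 1s in every bit position from lsb to msb inclusive."""
    if lsb > msb:
        raise ValueError("lower limit exceeds upper limit")
    return ((1 << (msb - lsb + 1)) - 1) << lsb


def scan_field(lsb, msb):
    """Yield bit positions from MSB down to LSB, as an index counter might."""
    counter = msb                 # load the counter from the upper limit register
    while counter >= lsb:         # compare the counter with the lower limit register
        yield counter
        counter -= 1              # decrement the counter
```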

1.1.6  CAM  Commands 

The  controlling  computer  can  issue  some  niunber  of  data  processing 
and  control  commands  to  CAM .  These  commands  may  be  embedded  into  the 
conventional  command  structure  of  this  machine,  hence  read  from  memory 
during  an  instruction  cycle  and  then  executed.  Alternatively,  the  CAM  system 
may  be  part  of  a  "variable  structure"  computer  as  proposed  by  G.  Estrin.^^ 
Commands  would  then  be  sequenced  for  any  given  problem  by  a  supervisory 
control  interacting  with  the  variable  structure  system.  Some  CAM  commands 
are  executed  over  the  entire  contents  of  CAM.  Other  commands  are  executed 
over some "input" set of detector-marked CAM words where the detector is
assumed  to  have  been  set  by  some  previous  search.  The  detector-marked 
set  at  completion  of  a  command  is  designated  as  the  "output"  set. 

Commands  are  designated  as  "basic  commands"  or  as  "compound 
commands".  Each  basic  command  is  a  useful  program  step  and  the  set  of 
basic commands serves as building blocks for compound commands. Both
basic  and  compound  commands  are  executed  in  response  to  a  single  instruction 
from  the  controlling  computer.  The  CAM  control  sequences  basic  commands 
to  execute  a  compound  command. 
A set of basic CAM memory commands and basic logical commands
are listed in Tables I and II, respectively. A set of compound commands is
listed in Table III. Both command sets are, of course, extendable. The
search command (SCH) and the maximization command (MAXA) are then
discussed in greater detail in the following pages. The notation C(R), defined
as "contents of register R", is used in the description of commands. The search
command (SCH) together with marking commands (CLP, WSB, CLC) allows
selection of a set having any given character. The basic write commands
(WFW and WSB) allow arbitrary transformation of the selected set. By
repeated use of these commands, it is possible to perform arithmetic and
logical operations on sets of CAM cells. Operations are usually executed
serial-by-bit and parallel-by-word; hence execution time is not strongly
dependent on the number of words processed.
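
As an illustration of serial-by-bit, parallel-by-word processing, the sketch
below adds one to a field in every selected word using only a masked search and
a one-bit masked write. It is a software model under stated assumptions, not the
report's command set: the function names are hypothetical, words are Python
integers, and the loops over word indices stand in for what CAM would do for all
words at once. The point to notice is that the outer loop runs once per bit of
the field, not once per word.

```python
def search(cells, selected, data, mask):
    """Return the selected word indices that match data wherever mask is 1."""
    return {i for i in selected if (cells[i] ^ data) & mask == 0}


def write_bit(cells, selected, position, value):
    """Write one bit into every selected word (models a masked write)."""
    for i in selected:
        if value:
            cells[i] |= 1 << position
        else:
            cells[i] &= ~(1 << position)


def increment_field(cells, selected, lsb, msb):
    """Add 1, modulo the field size, to bits lsb..msb of every selected word."""
    carry = set(selected)                      # words still propagating a carry
    for k in range(lsb, msb + 1):              # one pass per bit position
        bit = 1 << k
        ones = search(cells, carry, bit, bit)  # bit k is 1: becomes 0, carry goes on
        zeros = carry - ones                   # bit k is 0: becomes 1, carry stops
        write_bit(cells, ones, k, 0)
        write_bit(cells, zeros, k, 1)
        carry = ones
        if not carry:
            break
```

For a four-bit field, at most four search and write steps are needed whether
three words or three thousand are selected.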

The  controlling  computer  is  assumed  to  contain  an  arithmetic  unit  in 
which  it  is  possible  to  execute  the  usual  arithmetic  and  logical  operations  on 
single  words  at  far  greater  speeds  than  they  can  be  executed  in  CAM.  A  set 
of  compound  commands  was  selected  to  have  some  expectation  of  being 
executed  over  large  sets  of  words  with  a  view  to  determining  the  feasibility 
of  parallelism  sufficient  to  outperform  the  conventional  arithmetic  organ.  In 
a  later  report  consideration  will  be  given  to  any  further  properties  necessary 
to  permit  CAM  to  perform  in  a  manner  equivalent  to  networks  of  computers 
under  a  common  control. 

The  list  of  compound  commands  is  merely  indicative  of  some  possible 
choices.  It  is  obvious  that  any  compound  command,  treated  here  as  a  wired 
command,  can  be  synthesized  by  repeating  basic  or  lower  level  compound 
commands  under  program  control.  The  choice  of  wiring  a  compound  command 
is  the  common  choice  of  execution  time  vs.  hardware  complexity. 

In the remainder of this section timing equations for basic and compound
commands are described. One basic command (SCH) and one compound
command (MAXA) are discussed. For these commands a flow chart is presented
which defines elementary events and their sequence of occurrence. Elementary
TABLE II

BASIC LOGICAL COMMANDS

Mnemonic Code                  Command

 1)  LDM(Y)                    Some fraction of the M register is loaded with
     (Load M)                  C(Y) from memory of the controlling computer.

 2)  LDD(Y)                    Some fraction of the D register is loaded with
     (Load D)                  C(Y).

 3)  STOD(Y)                   Some fraction of C(D) is stored at address Y.
     (Store D)

 4)  TDMZ                      The C(D) or the C(M) are tested for zero.
     (Test D or M for Zero)

 5)  LXLR(Y)                   Index limit registers are loaded with C(Y).
     (Load Index Limits)

 6)  LK                        Index counters are loaded from specified limit
     (Load Index)              registers.

 7)  LKSL                      Index counters are loaded from specified limit
     (Load Index, Step Limit)  registers. Limit registers are incremented
                               (decremented) by one.

 8)  TK                        Contents of a specified index counter are tested
     (Test Index)              relative to some limit register.

 9)  SDMI                      Data and mask registers are set in positions
     (Set D and M)             addressed by index counters.

10)  TDMI                      Data or mask register is tested in position
     (Test D and M)            addressed by some index counter.

11)  MTIT                      Match type indicator is tested for zero, one, or
     (Test MTI)                greater than one.

TABLE III

COMPOUND COMMANDS

(Table body not legible in the source.)

events  are  defined  assuming  that  CAM  is  structured  as  in  the  block  diagram 
of  Figure  1.2.  On  each  flow  chart  the  state  sequence  for  CAM  is  indicated 
and  the  set  of  elementary  events  occurring  within  a  single  CAM  state  is  shown. 

The controlling computer presents data and mask register contents,
an index word and a command code to CAM, then issues a START pulse. The
CAM system executes the command, then issues a COMPLETION pulse. Each
command sequence is described for the interval between the START and
COMPLETION pulses. Some arbitrary command (CMD) is executed in a time,
$\tau_{CMD}$, starting and ending in ST 0 with an intermediate state sequence,
ST 1 through ST n. Each intermediate state is assigned a time, $\tau_{CMD1}$
through $\tau_{CMDn}$.

Each state time, $\tau$, is a sum of elementary times. Elementary times
are defined on the flow chart for each command and may be common to several
commands. Subscripts for elementary times are chosen to be of mnemonic
significance. State times may consist of a fixed number of elementary times,
hence be of a fixed duration, determined by a delay generator. State times
may also consist of a variable number of elementary times, hence be of
variable duration, determined by a completion pulse.

Any logical operation on contents of the mask, data or scan register
is executed in a single state time, $\tau_L$, and an elementary time, $t_L$,
for all operations and all commands. Any branch condition for control states,
implied by a test on contents of the mask, data or index register or on
contents of the match type indicator, is evaluated in a state time, $\tau_L$,
and an elementary time, $t_L$, which may overlap other logic states. Each state
(including the initial state) is assigned an elementary "advance time",
$t_{sa}$, to allow for transition of the state counter into the following state
and for settling of control lines in this state. The state time for any logic
state is

$$\tau_L = t_L + t_{sa}$$

Each command has a transition time, $\tau_0 = t_{sa}$, from its initial state.
A basic command (CMDB) is executed in a time

$$\tau_{CMDB} = \tau_0 + \sum_{i=1}^{n} \tau_{CMDi}$$

where n is the number of intermediate states. Under some conditions a basic
command may be skipped, hence be executed in time $\tau_0$. A compound command
(CMDC) exists as a series of basic commands (CMDB1 through CMDBN) with
intervening logic states. A compound command is executed in a time

$$\tau_{CMDC} = \tau_0 + \alpha_1 \sum_{i=1}^{n_1} \tau_{CMDB1i} + \cdots + \alpha_N \sum_{i=1}^{n_N} \tau_{CMDBNi} + \ell\,\tau_L$$

where $n_1$ through $n_N$ and $\alpha_1$ through $\alpha_N$ are respectively
the number of intermediate states in and the number of iterations of basic
commands CMDB1 through CMDBN, and $\ell$ is the number of logic states in the
compound command.
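
The timing relations above can be checked numerically with a short sketch. It
assumes the reading that a logic state takes $t_L + t_{sa}$, that a basic
command takes $\tau_0$ plus the sum of its intermediate state times, and that a
compound command takes $\tau_0$ plus each basic command's state-time sum
multiplied by its iteration count, plus the number of logic states times
$\tau_L$. Function and parameter names are hypothetical, and the numbers in any
call are arbitrary time units rather than measured values.

```python
def logic_state_time(t_l, t_sa):
    """State time of one logic state: logic time plus advance time."""
    return t_l + t_sa


def basic_command_time(tau_0, state_times):
    """Transition time plus the sum of the intermediate state times."""
    return tau_0 + sum(state_times)


def compound_command_time(tau_0, basics, n_logic_states, tau_l):
    """Time of a compound command.

    basics is a list of (iterations, state_times) pairs, one pair per basic
    command; n_logic_states is the number of intervening logic states.
    """
    total = tau_0
    for iterations, state_times in basics:
        total += iterations * sum(state_times)
    return total + n_logic_states * tau_l
```

For example, compound_command_time(1, [(2, [2, 3]), (1, [4])], 3, 2) adds
1 + 2*(2 + 3) + 1*4 + 3*2 time units.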

Search Command -- SCH(Tag)

The SCH command searches the memory matrix for words which match
C(D) in bit positions where C(M) are 1. No match test is made in bit positions
where C(M) are 0. The tag consists of control bits which may be specified to
be either 1 or 0 or may be unspecified. A match test is made on all specified
control bits. Contents of the memory matrix and of mask and data registers
are unaltered. Match tests are sequenced by the search drive control discussed
in Section 1.1.4.

The match detector (P) is initially set to 1 and then is set to 0 in word
locations where one or more match tests are not satisfied.

The match type indicator (MTI) indicates the multiplicity of matching
words.
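
The effect of SCH on the detector and the match type indicator can be modelled
as follows. This is a minimal sketch under simplifying assumptions: words are
Python integers, the tag control bits are omitted, and the MTI is encoded as
0, 1 or 2 (2 standing for "more than one match"). The function name and the
encoding are hypothetical, not taken from the report.

```python
def sch(cells, data, mask):
    """Masked equality search over all words.

    Returns (detector, mti): detector[i] is 1 if word i matches data in every
    bit position where mask is 1; mti is 0, 1, or 2 for several matches.
    """
    detector = [1] * len(cells)          # P is initially set to 1 everywhere
    for i, word in enumerate(cells):
        if (word ^ data) & mask:         # some unmasked bit position differs
            detector[i] = 0
    mti = min(sum(detector), 2)
    return detector, mti
```

The memory contents, data and mask are left unaltered, as the command
description requires; only the detector and the MTI carry the result.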

The event and state sequences for a SCH command are shown in Flow
Chart I. The total time to search a set of words, each containing n bits,
where the search is performed on k bits (k ≤ n), is:

$$\tau_{sch} = \tau_0 + \tau_{sch1} + \tau_{sch2}$$

where
FLOW CHART I (event sequence and state sequence for the SCH command)
$$\tau_{sch1} = t_{clp}$$

$$\tau_{sch2} = k\,t_{sbd1} + (n-k)\,t_{sbd2} + t_{sslp} + t_{sat} + t_{ssmp} + t_{ssn} + t_{mti} + t_{sa}$$

$t_{sbd1}$ = interrogation delay per bit
$t_{sbd2}$ = bypass delay per bit

Absolute Maximum Command -- MAXA

The MAXA command locates members of some selected input set
which store the maximum absolute numeric value for some specified field
and determines the maximum value in this field.

The command is executed over cells having a true detector element.
The field over which maximization occurs is defined by the contents of the
upper and lower limit registers for some index register as described in
Section 1.1.4. Bit positions not in the maximization field are not examined.
The specified field in each CAM cell is interpreted as an unsigned positionally
weighted number having a weight associated with each bit position.

The index counter stores the index k which runs over bit positions of
the specified field starting at the most significant bit. Allowable number
systems are subject to the restriction that the weight $a_k$ of bit position k
has a unique value larger than the value of any number generated using only
lesser weights. Some number representations which satisfy these requirements
are:

1)  Fixed point binary representation. If numbers are 2's
complement, magnitude must be minimized for the set of
negative numbers.

2)  Fixed point BCD representation.

3)  Normalized floating point with leading characteristic.

Following execution of a MAXA command, detector elements, P, are
"1" for CAM cells in the selected input set which contain the maximum value for
the specified field and are "0" elsewhere. The MTI indicates the multiplicity of
words having this maximum value for the specified field.

The MAXA command is a compound command, executed as a series
of masked equality searches. The unmasked search key (maximum value of
the specified field) is generated serial-by-bit starting at the most significant
bit. The k-th search is executed over the set of selected CAM words which
match the key in bit positions 1 through k-1 of the specified field. This set is
searched for words having a "1" in the k-th bit position. If the set satisfying
the k-th search has several members, the maximum value of the specified field
is taken as "1" in the k-th bit position and the set satisfying the k-th search
is taken as the input set for the (k+1)st search. If the set satisfying the
k-th search is vacant, the maximum value for the specified field is taken as
"0" in the k-th bit position; this bit position is deleted from subsequent
searches, and the input set for the k-th search is taken as the input set for
the (k+1)st search. If the set satisfying the k-th search contains a single
member, the algorithm is terminated.

The  upper  and  lower  limit  registers  for  the  chosen  index  register  are 
loaded  as  the  MAXA  command  is  issued.  The  index  counter  is  initially  loaded 
with  contents  of  the  upper  limit  register  and  is  decremented  by  one  for  each 
search  of  the  maximization  field.  The  algorithm  is  terminated  when  a  unique 
maximum  is  located  or  when  contents  of  the  index  counter  equal  the  contents  of 
the  lower  limit  register. 
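
The MAXA procedure described above can be sketched as a loop of one-bit
searches that narrows the candidate set from the most significant bit downward.
This is an illustrative software model, assuming an unsigned binary field in
Python integers with bit 0 as the LSB; the names maxa and field_value are
hypothetical, the MTI is encoded as 0, 1 or 2 (2 meaning "more than one"), and
the set comprehension stands in for a single parallel search.

```python
def field_value(word, lsb, msb):
    """Unsigned value of bits lsb..msb of a word."""
    return (word >> lsb) & ((1 << (msb - lsb + 1)) - 1)


def maxa(cells, selected, lsb, msb):
    """Locate the selected words holding the maximum value of a field.

    Returns (candidates, mti), where candidates is the set of word indices
    left marked by the detector.
    """
    candidates = set(selected)
    k = msb                                   # index counter <- upper limit
    while len(candidates) > 1:                # stop on a unique maximum
        ones = {i for i in candidates if (cells[i] >> k) & 1}
        if ones:                              # key bit k is 1: keep the matches
            candidates = ones
        # a vacant result means key bit k is 0: the input set is carried over
        if k == lsb:                          # counter equals the lower limit
            break
        k -= 1                                # decrement the index counter
    return candidates, min(len(candidates), 2)
```

The maximum value itself can then be read from any remaining word with
field_value. At most one search per bit of the field is made, regardless of how
many words are in the input set.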

The  event  and  state  sequence  for  the  MAXA  command  are  illustrated  in 
Flow  Chart  II.  Where  a  basic  command  is  indicated  as  an  event,  the  event  and 
state  sequence  for  that  command  are  implied. 

Since the MAXA command is terminated at the first step for which a
unique maximum is determined, execution time for the command is in general
dependent on the distribution of numerical values for words in the selected set
as well as on the length of the specified field. The maximum execution time
for a MAXA command, executed over a field of length P, occurs when the
maximization field is zero for all cells in the selected input set.
FLOW CHART II (event and state sequence for the MAXA command; the dotted line
is followed by the MINA command). States shown include ST0 ($\tau_0$), CLC
states, CPP states, ST1 ($\tau_L$), ST2 ($\tau_L$), SCH states, ST3 ($\tau_L$)
and ST4 ($\tau_L$).
$$\tau_{maxa,max} = \tau_0 + \tau_{clc} + \sum_{i} \tau_{cppi} + (P+1) \sum_{i} \tau_{schi} + (3P+1)\,\tau_L$$

(4P + 8 states)

The minimum execution time for a MAXA command, executed over a field of
length P, occurs when a unique maximum is located at the first search and is:

$$\tau_{maxa,min} = \tau_0 + \tau_{clc} + \sum_{i} \tau_{cppi} + \sum_{i} \tau_{schi} + 2\,\tau_L$$

(9 states)
I-2  Assignment Problems


The concept of a fixed plus variable structure computer system (F+V)
was proposed in 1959 by G. Estrin (Refs. 1, 2) at UCLA. The primary goal of the
F+V system organization, as stated by Estrin, is "to permit computations
which are beyond the capabilities of present systems by providing an inventory
of high speed substructures and rules for interconnecting them, such
that the entire system may be temporarily distorted into a problem-oriented
special purpose computer". Equivalently, one could state that the goal of the
F+V system organization is to extend the class of practicably computable
problems, where "practicable computability" of a problem is primarily a function
of programming time, computation time, and cost of computing.

The F+V system aims to achieve the above goals by reducing the computation
time of a problem through a combination of the computational
power and flexibility of a general purpose computer (F) with the speed of
simultaneously operating special purpose structures (V). The general purpose
(GP) computer is referred to as "fixed" in the sense that its hardware has been
connected into a structure which is expected to vary very little during the
lifetime of the computer, and thus does not require specification of the
interconnections of its elements along with the statement of the problem. This
property permits development of programming languages for the GP computer,
resulting in easier communication between the user and the machine at some
expense in pure computational speed.

The role of F in the F+V system is, in addition to being a powerful
computing unit, to contain the complex, though not in general most time
consuming, parts of the program for the problem, to perform most of the
input-output operations, and to permit the use of already developed higher
programming languages as well as the use of large program and subroutine
libraries associated with a reliable modern GP computer.

The variable structure computer, V, is used to gain speed in computation
by employing special purpose computation techniques or generally less complex
iterative procedures. The variable nature of the interconnections of its hardware
permits utilization of the same hardware for different structures, a necessary
condition to make the F+V system economically feasible. The more
common of the special purpose techniques to be used in V structures are
wired programs, and problem-oriented organizations of the hardware which
may utilize, wherever effective, unconventional interconnection of computing
circuits and unconventional number representation.

In  order  to  combine  the  high  speed  of  special  purpose  computers  with 
flexibility  required  in  efficient  computation,  real  time  structure  changes  in  V 
must  be  permitted. 

These structure changes may be effected by electronic or electromechanical
switching of the hardware into preconstructed structures, or by
actual physical change in the interconnections of V hardware. The latter changes
are usually quite time-consuming, even though most of these may only involve
substitution of units previously constructed off-machine. Nevertheless, such
a mechanical change in V structure may result in considerable reduction of
overall computation time and may be the deciding factor in rendering a problem
practicably computable.

Equivalent to a subroutine library in a GP computer, V will have a
library of substructures for the more common functions. Some of the more
frequently used structures will usually be electronically switchable into the
structure for a given problem.

Whenever it is not distorted into a form for application to a particular
problem, V is in a "standard state" configuration. In the standard state a
number of electronically switchable configurations are available for efficient
performance of the more common elementary operations or for computation of
elementary functions. In this state V may be considered as an extension of F for
computing these functions as macro-commands, or it may be used to execute
independent programs.

In a non-trivial application of the F+V system, F and one or more
configurations of V (constrained by the size of its inventory) will compute in
parallel. Their interaction is controlled by a separate supervisory control
unit, SC, which enforces cooperation or synchronization of processes going on
in F and V. The SC monitors operations in F and V and arranges for interlocks
and data transfers between these. It is in direct communication with V via
an appropriate set of control signals, and uses interrupt features and a set
of special commands of F to communicate with the latter. At this stage,
the control functions of the SC will be kept as simple as possible; in concept,
however, it could involve a complex computer in its own right.

The  characteristics  of  F,  V  and  SC  are  discussed  in  detail  below.  A 
block  diagram  of  the  F+V  system  is  given  in  Figure  1.  5. 

1.2.1  The Fixed Structure Computer, F

Some of the reasons for including a fixed structure GP computer in
the F+V system were discussed in the previous section. Briefly, all parts
of the problem where no significant gain in speed can be expected through
special purpose techniques will be left for computation in F; V will be used
to mechanize only those parts of the problem which require the major (generally
iterative) portions of its overall computing time.

Similarly, all operations which F is better equipped to handle, such
as most of the input-output operations, will be delegated to it.

Examples  of  analysis  of  complex  problems  and  assignment  of  tasks  to 
F  and  V  are  to  be  found  in  References  3,  4,  5,  6,  7,  8  and  9. 

The first choice for F was the IBM 7090 solid state computer. This
choice was based on the fact that the IBM 7090 is a modern fast computer with
a large and versatile list of instructions, independent input-output processing,
and a large number of auxiliary storage and input-output units, and is readily
available (i.e., already on the UCLA campus). The FORTRAN programming language
is convenient to use, and a large number of programs and subroutines are
available from many 7090 users through the SHARE organization. Further detailed
information about the characteristics of the IBM 7090 can be obtained from
Ref. 10.

In  the  following  discussion  any  reference  to  F  implicitly  refers  to  the 
IBM  7090. 
BLOCK DIAGRAM OF THE F+V SYSTEM
FIGURE 1.5


1.2.2  The Variable Structure Computer, V

It was already stated that V is a collection of hardware, usually in
the so-called "standard state", which can be restructured into problem-oriented
special purpose structures. The principal special purpose techniques
to be used are wired program control, specialized computing circuits,
and unusual number representation, all of which usually cannot be incorporated
directly in a GP computer structure, and which are feasible in V only due to
sharing of hardware between different structures.

During a computation V may be in the standard state or in one or
several problem-oriented structures, and it may change structure several
times before the computation is completed. The structure changes may be
effected by electronic or electromechanical switching, or by actual physical
rearrangement of interconnections. Each structure contains its own control
units which communicate with a higher order control unit.

1.2.3  The Standard State

In the standard state the current inventory of V is used to extend
the instruction list of F with the more commonly used operations. These
operations can be performed in V using any unconventional methods which permit
faster execution of the operation.

Among the commonly used operations which may be included in the
standard state of V are such elementary functions as trigonometric functions,
inverse trigonometric functions, logarithmic and exponential functions, n-th
root or power, hyperbolic functions, Bessel functions, differentiation and
integration, random number generation, and statistical operations; complex
arithmetic; vector arithmetic; matrix operations; and non-arithmetic operations.
Some of the listed operations may be available in single, multiple, or partial
precision. Simultaneous computation of certain subsets of the above operations
may be incorporated in the standard state. For a given operation a choice
between high-speed or low-speed structures, which require large or small amounts
of hardware, respectively, may be available.

If included in the standard state, such operations can be performed
by electronic switching of the standard state control unit(s), and are thus very
rapidly available. The number of such operations actually included in the
standard state is a function of available inventory and may be small in the
initial standard state. However, as V inventory is increased, the power of
the standard state is increased accordingly.

Based on an extrapolation of the hardware requirements established
during problem studies (see references above), the V hardware inventory is
eventually expected to permit construction of the following simultaneously
operating units: a 144-bit parallel arithmetic unit, 360 bits of shifting
registers, 144 bits of comparators, 27 bits of counters, and control units
containing 1000 logical elements. In addition, approximately 8 simultaneously
accessible very high speed memory units (0.5 µsec access time) of 1000 72-bit
words each, 2 simultaneously accessible high speed memory units (1 µsec
access time) of 16,000 72-bit words each, a 1000 word content-addressable
memory, and a large slower back-up memory (e.g., a disc memory unit) are
expected to be contained in V.

1.2.4  The Control Hierarchy

The execution of computations in V is controlled by a number of control
units. Since some of the control units can exercise control over others, a
control hierarchy (or control tree) is formed. Each of the control units in the
hierarchy, in general, contains circuitry for electronic switching of its wired
program such that it may be used to control several different programs. Use
of stored program control units, however, is not entirely ruled out.

A  typical  configuration  of  the  control  tree  is  depicted  in  Figure  1.6. 

On the lowest control level are control units which control such common
arithmetic and logical operations as addition, subtraction, multiplication,
division, square root, conversion from floating point to fixed point and vice
versa, and possibly others. Each of the arithmetic operations may be performed
in fixed or floating point, single, double or partial precision. The exact
nature of such a control unit depends on the arithmetic schemes used but, in
general, it corresponds in complexity to the arithmetic control unit of a large
scale general purpose computer. Control units on the lowest control level can
generate only elementary commands, i.e., commands which cause only a single
action to take place (e.g., a transfer from one register to another, a shift,
counting by 1). The particular operation executed by such a control unit is
specified by a compound command generated by a control unit on a higher
control level.

F+V CONTROL HIERARCHY
FIGURE 1.6

On the next control level are so-called "subroutine" control units.
These control units may execute elementary functions, complex arithmetic,
vector algebra, etc. Each control unit on this control level may use the
operations controlled by lower level control units as compound commands (e.g.,
it may generate control signals which specify "add, single precision, floating
point", etc.), and may also generate all elementary commands. The wired
programs of a subroutine control unit usually correspond to a stored program
subroutine in a general purpose computer.

A number of consecutively higher control levels may exist, each of
which may specify operations performed by lower level control units as
compound commands. Some of these control units may use stored programs. On
the highest control level, however, is the supervisory control unit.

The purpose of the supervisory control is to supervise the execution of
the computations in F and in V; to coordinate information exchanges between
F and V, between F or V and peripheral equipment, and between the latter
themselves; and to perform interlocking functions. For efficient performance of
this task, the supervisory control unit may contain a wired program along with
stored programs in both F and V.

Each of the control units, regardless of the control level, has to
perform three basic functions: (1) generate a sequence of states which
correspond to sequential operations specified by the program being mechanized;
(2) during each sequence state, generate a subset of available elementary and
compound commands, where the commands in the subset correspond to those
operations in the program which may be performed simultaneously; (3) in each
sequence state provide for generation of the next sequence state as specified
by the program. The next sequence state may be different from the natural
sequence due to unconditional or conditional branching specified by the program.

Of the enumerated functions, only the first can be problem-independent
if a large enough number of sequence states is available. The other two,
necessarily, depend on the algorithms being mechanized and thus represent
what is equivalent to a stored program in a general purpose computer.

Consequently, for a fixed hardware structure (as is the case for
computing many types of elementary functions where the hardware is in the usual
arithmetic unit configuration) a change in the computational algorithm (program)
implies a change in the command generating structure as, in general, different
commands are generated in the same sequence state for different algorithms,
and a change in the branching structure as different algorithms require different
branching. These changes in the control unit will, most likely, be effected by
mechanical means if the control structure is on a sufficiently high level of the
control hierarchy, and by electronic means on the subroutine and arithmetic
operation levels.

There are several courses of action open for changing the branching
logic and command matrix structures. A very general approach would involve
establishing branching and command matrices which permit transition from a
given sequence state to any other sequence state for any one of the completion
signals and control register states, and generation of any command for a given
sequence state and state of the control register, respectively. Given such
matrices, a wired program could be established by enabling proper subsets of
intersections of the branching and command matrices under the control of a
plug-board, punched card, or a photo-electric switching device.

The major advantage of such a general switching scheme is the speed of
restructuring of the control unit for different uses of the F+V system and
the low cost of doing so. A disadvantage is that the initial cost of such a
control unit, in terms of hardware, may be very large. Studies of control
requirements of a number of elementary functions have also indicated that
only a small percentage of the total intersections of such a matrix are
enabled for a given problem. Hence a large portion of the V hardware may
stand idle. Further, the speed of switching from one program to another
during a given problem, even if it is high for a mechanical change, may be
considerably lower than could be achieved electronically. Finally, the
required large matrices of switching circuits may slow down the operation of
the control unit while it executes a particular wired program.
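The utilization argument above can be made concrete with a small sketch. The
matrix size and the set of enabled intersections below are hypothetical
figures chosen only for illustration; they are not taken from the studies
mentioned in the text.

```python
def matrix_utilization(num_states: int, num_commands: int, enabled: set) -> float:
    """Fraction of intersections of a general command matrix that a program enables."""
    total = num_states * num_commands
    for state, command in enabled:
        # Guard against intersections that do not exist in the matrix.
        if not (0 <= state < num_states and 0 <= command < num_commands):
            raise ValueError(f"intersection {(state, command)} is outside the matrix")
    return len(enabled) / total


# Hypothetical wired program: 16 sequence states, 32 commands, and two
# commands enabled in each of the first 12 sequence states.
enabled = {(s, c) for s in range(12) for c in (s % 32, (s + 7) % 32)}
print(f"{matrix_utilization(16, 32, enabled):.1%} of intersections enabled")
```

With figures of this kind, most of the switching hardware in a general matrix
carries no program information, which is the idle-inventory concern raised
above.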

Another course of action involves building M branching logic units and
M command matrices to correspond with the M electronically switchable wired
programs which are to be associated with the given control unit. Each of the
matrices would contain only the intersections required by the program
mechanized, and switching from one wired program to another can be
accomplished at electronic speeds. A mechanical structure change of the
control, however, will be time consuming, as new branching logic units and
command matrices need to be wired.

Because the general matrix approach has large hardware requirements and low
inventory utilization, the proposed control unit will use the second
approach, i.e., a separate branching logic unit and command matrix for each
of the electronically switchable wired programs. Use of general matrices in
some of the control units in the future is, however, by no means ruled out.

Since  the  control  functions  of  control  units  on  all  levels  of  control  are 
the  same,  each  such  control  unit  may  be  mechanized  by  using  the  same  basic 
model  of  the  control  unit.  The  structure  and  logical  properties  of  such  a 
model  are  discussed  in  the  following  section. 

1.2.5  A Model of the Control Unit

In this section a model of a control unit with electronically switchable
wired programs will be described. This control unit may be used in all levels
of the control hierarchy. Let V(i) be the i-th special purpose structure of V;
then CU(i,j) designates the j-th control unit of V(i). When explicit reference
to V(i) has been made, however, it is sufficient to designate CU(i,j) by CU(j).
This simplified notation is used in the sequel, i.e., the index i is not used.

The logical design of the proposed control unit is based on asynchronous
design philosophy. There are several versions of the latter. Totally
asynchronous design, as formalized by Muller and Bartky (Reference 11),
requires that the correct completion of a control operation be verified
before the next control operation is permitted to start. For example,
completion signals are generated at every memory element of a register to
indicate that the outputs of a particular memory element correspond to its
inputs.

A less rigorous asynchronous design method requires positive verification
of completion of the control operation only at a few points of the circuitry.
Finally, completion signals may be generated without actual verification of
the correct completion of the control operation. Such completion signals are
based on the elapsing of a time interval which is known to be sufficient for
correct completion of the control operation. A relatively complete
bibliography on asynchronous design is given in Reference 12.

Totally asynchronous circuits permit very fast operation (e.g., as soon as
all the circuits involved have responded to a control signal and the proper
completion signals have been formed, the state of the control may be changed
to generate a new control signal) and their operation is reliable, but the
hardware requirements of the completion signal circuits are rather large, and
thus a major portion of the V hardware would not be available for actual
computing operations.

The next level of asynchronism (not all of the circuit points are tested
for actual response to the control signal) permits fast operation, but the
correct response of all the circuitry to the control signal is no longer
guaranteed. Hardware requirements are reduced but may still be considerable.
The last discussed level of asynchronism (simulation of the operation times
of circuits without actual test for any response to the control signal)
offers no indication of correct operation of the circuits, and the time
elapsed before the completion signal is generated must allow for the worst
case variations of the response times of the circuits involved in the control
operation. Hardware requirements, however, are small.

The proposed control unit will utilize the last two types of asynchronous
design. The totally asynchronous design was rejected on the basis of large
hardware requirements. The elementary commands will be partitioned into
classes such that the operation times of the elementary commands in a given
class are almost equal. For each such class, a single completion signal is
generated after a time interval sufficient for completion of the commands in
the class has elapsed. Completion signals for compound commands are generated
by the corresponding lower level control units as soon as the execution of
the command has been completed. Elementary commands which perform comparison
or test operations will have at least two completion signals, that is, the
completion signals are used to indicate the result of the compare or test
operation.
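The class-based timing described above can be sketched as follows. The sketch
assumes that the delay for a class is simply the longest operation time among
its members; the command names follow the report's notation, but the times
(in nanoseconds) and the class membership are hypothetical.

```python
def completion_delays(op_times, classes):
    """One fixed delay per class: long enough for the slowest command in it."""
    return {name: max(op_times[cmd] for cmd in cmds) for name, cmds in classes.items()}


def idle_time(op_times, classes):
    """Time each command waits for its class completion signal after finishing."""
    delays = completion_delays(op_times, classes)
    return {
        cmd: delays[name] - op_times[cmd]
        for name, cmds in classes.items()
        for cmd in cmds
    }


# Hypothetical operation times in nanoseconds.
op_times = {"mx(1)": 40, "mx(2)": 45, "mz(1)": 20, "mz(2)": 25}
# Hypothetical partition: transfer commands share cx, miscellaneous share cz.
classes = {"cx": ["mx(1)", "mx(2)"], "cz": ["mz(1)", "mz(2)"]}

print(completion_delays(op_times, classes))
print(idle_time(op_times, classes))
```

The closer the operation times within a class, the less time is lost waiting
for the shared completion signal, which is why the partition groups commands
of almost equal duration.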

Figure 1.7 depicts the block diagram of the proposed control unit, CU(j).
The major units of CU(j) are: a control register, CR(j); a sequence generator,
SG(j); a subsequence generator, SSG(j); command matrices, CM(j,m); a
completion signal generator, CG(j); and a timing chain, TC(j). Figure 1.8
depicts these units in more detail. The purpose and structure of each unit,
as well as their interaction, will be discussed in detail below.

The control register, CR(j), is a collection of k memory elements, CR(j,k),
which are used to store information concerning the particular computation in
progress, results of tests and comparisons which may be utilized later in the
computations, etc. The outputs of CR(j,k) are designated as cr(j,k;1) and
cr(j,k;0), to represent the "1" and the "0" states of CR(j,k), respectively.
The outputs of CR(j) may be used in the branching logic of the sequence
generator or in the control matrices.

The sequence generator, SG(j), contains a sequence counter, SC(j); a
sequence register, SR(j); and M branching logic units, BL(j,m), corresponding
to the M different wired programs associated with CU(j). The purpose of the
sequence generator is similar to that of the program counter in a general
purpose computer, i.e., to permit execution of the different computational
steps in a predetermined sequence. The sequence counter, SC(j), generates as
many as 2^n sequence steps, corresponding to its n memory elements. The
current state of SC(j) is stored in the sequence register, SR(j), and the
sequence counter is then advanced to its next state as prescribed by the
branching logic of the particular program being executed. In the case where
the next state depends on the result of a compare or test operation, the next
state is based on the assumption that the test or comparison failed (was
"false"). If the test is actually "true", SC(j) is set into the required
state as soon as the result is known. This type of operation is successful if
the results of compare and test operations which have the higher probability
of occurring are labelled as "false".
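The effect of the labelling rule can be illustrated with a minimal sketch.
The `SequenceCounter` class and the loop below are hypothetical constructions
for illustration only; the report does not prescribe any particular program.
The counter is advanced on the assumption of a "false" result and corrected
when the result turns out to be "true".

```python
from dataclasses import dataclass


@dataclass
class SequenceCounter:
    """Toy model of SC(j) advancing on the assumption of a 'false' test result."""
    state: int = 0
    corrections: int = 0  # number of times the assumed next state was overridden

    def advance(self, next_if_false: int, next_if_true: int, test_is_true: bool) -> int:
        self.state = next_if_false      # advanced before the result is known
        if test_is_true:                # result arrives: the assumption was wrong
            self.state = next_if_true
            self.corrections += 1
        return self.state


def corrections_for_loop(iterations: int, exit_labelled_true: bool):
    """Run a loop whose exit condition holds only on the last iteration."""
    sc = SequenceCounter()
    for n in range(iterations):
        exiting = n == iterations - 1
        # Either "exit" or "continue" is the outcome labelled "true".
        test_is_true = exiting if exit_labelled_true else not exiting
        # State 0 is the loop body, state 1 is the exit.
        next_if_false, next_if_true = (0, 1) if exit_labelled_true else (1, 0)
        sc.advance(next_if_false, next_if_true, test_is_true)
    return sc.state, sc.corrections


print(corrections_for_loop(10, exit_labelled_true=True))
print(corrections_for_loop(10, exit_labelled_true=False))
```

Both labellings reach the same final state, but labelling the frequent outcome
(continuing the loop) as "false" requires far fewer corrections of SC(j).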

The memory elements of the sequence counter, SC(j), are denoted as SC(j,1)
through SC(j,n), corresponding to n memory elements; their outputs are
denoted by sc(j,k;1) and sc(j,k;0), for the k-th memory element. The elements
of the sequence register, SR(j), are denoted by SR(j,1) through SR(j,n). The
outputs of the k-th memory element are denoted by sr(j,k;1) and sr(j,k;0).

The outputs of SR(j) are combined in a decoder, SD(j), into 2^n sequence
states, s(j,1) through s(j,2^n). Only those sequence states which are required
for a particular wired program are generated.

The M branching logic units, BL(j,m), use information from the sequence
decoder, control register, and completion signals to advance the sequence
counter into its next state. The outputs of the i-th branching logic unit are
the i-th set, reset, and trigger signals for the R-S-T type memory elements of
the sequence counter, i.e., s(i)-SC(j,k), r(i)-SC(j,k), and t(i)-SC(j,k), for
the k-th memory element of SC(j).

The subsequence generator, SSG(j), contains a subsequence counter,
SSC(j); a subsequence register, SSR(j); and logical circuits, SSL(j), for
advancing SSC(j) into its next state. For each state of the sequence
generator, the subsequence generator may be advanced through p successive
states, corresponding to the p memory elements in the ring counter type
structure of SSC(j). The reason for inclusion of SSG(j) is to reduce the
number of states required in SG(j) and, correspondingly, to reduce the
complexity of the M branching logic units. The states of SSC(j) always occur
in a fixed order (no branching) and thus are used to sequence parts of the
wired program which involve no branching. The SSC(j) may be reset at any time
to its origin state. This is done every time the state of the sequence
generator is changed.
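The saving in sequence states can be sketched as follows. The `Sequencer`
class and the helper `sequence_states_needed` are hypothetical illustrations,
assuming that a wired program is made of straight-line runs separated by
branches and that each run is stepped through by the subsequence counter.

```python
import math


class Sequencer:
    """Toy model of SG(j) together with a p-state ring counter SSC(j)."""

    def __init__(self, p: int):
        self.p = p
        self.seq = 1   # current sequence state
        self.sub = 1   # current subsequence state (origin state is 1)

    def next_subsequence(self) -> None:
        if self.sub == self.p:
            raise RuntimeError("subsequence states exhausted; change the sequence state")
        self.sub += 1  # fixed order, no branching

    def next_sequence(self, state: int) -> None:
        self.seq = state
        self.sub = 1   # SSC(j) is reset whenever the sequence state changes


def sequence_states_needed(run_lengths, p: int) -> int:
    """Sequence states needed when each state covers up to p straight-line steps."""
    return sum(math.ceil(n / p) for n in run_lengths)


runs = [4, 3, 1]  # hypothetical straight-line run lengths between branches
print(sequence_states_needed(runs, p=1))  # without a subsequence counter
print(sequence_states_needed(runs, p=4))  # with a 4-state subsequence counter
```

Fewer sequence states mean fewer memory elements in SC(j) and fewer
transitions for each branching logic unit to decode.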

The memory elements of SSC(j) and SSR(j), as well as their outputs,
are denoted SSC(j,k) and SSR(j,k), and ssc(j,k;1), ssc(j,k;0), ssr(j,k;1), and
ssr(j,k;0), respectively.

The selection of the m-th command matrix, CM(j,m), for a wired program,
E(j,m), is accomplished by generating M complete sets of subsequence
states, ss(j,l;m) through ss(j,k;M), corresponding to the wired programs.
The outputs of this structure switching unit go to all of the M command
matrices.

The  outputs  s(j,i),  of  the  sequence  generator,  and  the  outputs  ss(j,  k;m), 
of  the  subsequence  generator,  may  further  be  combined  (for  circuit  economy 
reasons)  by  logical  AND  function  into  signals  denoted  as  u(j,i,  k;m). 

The M command matrices, CM(j,m), generate the elementary and compound
commands which are specified for a given subsequence state of a given
sequence state. As was mentioned before, the set of elementary commands may
be partitioned into classes consisting of elementary commands whose operation
times are approximately equal. A particular partition proposed here is:
parallel data transfer commands, mx(j,k); test and compare commands, mt(j,k);
memory read command for each memory, me(j,k); miscellaneous commands, mz(j,k),
which include setting and resetting of registers or memory elements, and
counting by one; compound elementary commands (arithmetic commands), ma(j,k);
and modifiers of the compound commands, h(j,k), which specify whether the
compound command to be executed is single precision, double precision, fixed
point, floating point, etc. As the number of different elementary commands
increases, other partition schemes may be required.

The commands generated for a wired program are specified by combining,
using logical AND functions, the required sequence state, s(j,k), and the
subsequence state, ss(j,l;m), of the given wired program, E(j,m).

The completion signals for the elementary commands are generated in the
completion signal generator, CG(j), where proper time delay units are used.
Completion signals for compound commands are generated in the proper lower
level control units. Completion signals are designated by cx(j) for the class
of transfer commands; cz(j) and ce(j,k) for the miscellaneous elementary
commands and for the k-th memory read command, respectively; ca(j,k) for the
k-th compound command; and ct(j,k;r) for the r-th result of the test command
mt(j,k). All completion signals are combined with the wired program selection
signals, E(j,m), to generate a complete set of completion signals for each
wired program, such that selection of the proper branching logic unit is done
by the appropriate set of completion signals.

Finally, the control of the control unit itself is executed by a timing
chain, TC(j), which generates signals for transferring the contents of the
sequence and subsequence counters into their respective registers, advancing
the states of the former, and correcting the states in the case where the
assumption of a "false" test result proved incorrect.

A detailed procedure for designing control units for mechanizing
computation of elementary functions, as well as a number of examples of such
control units, are given in Reference 13.

A convenient description of the operation of the control unit for
mechanizing a given computation can be obtained by constructing a table which
shows the "present" and the "next" states of the control unit. A particular
"present state" entry specifies the states of the sequence generator, s(j,k),
the subsequence generator, ss(j,k;m), and the control register, CR(j,i); lists
the commands generated; and shows the states of the sequence generator and the
control register as functions of the completion signal, i.e., the "next
state". An example of such a description is given in Figure 1.9.
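A present-state/next-state table of this kind maps naturally onto a lookup
structure. The following is a minimal sketch, not a transcription of
Figure 1.9: the table contents are hypothetical, and for brevity only the
sequence state is tracked (subsequence states and the control register are
omitted).

```python
# present sequence state -> (commands generated, {completion signal: next state})
# Hypothetical wired program: load, test, modify, loop until the test succeeds.
TABLE = {
    "s1": (["mx(1)"], {"cx": "s2"}),
    "s2": (["mt(1)"], {"ct(1;0)": "s3", "ct(1;1)": "s4"}),
    "s3": (["mz(1)"], {"cz": "s2"}),
    "s4": ([], {}),  # final state: no commands, no successors
}


def trace(table, start, completion_signals):
    """Follow the table from `start`, consuming one completion signal per step."""
    state = start
    visited = [state]
    for signal in completion_signals:
        _commands, next_states = table[state]
        if signal not in next_states:
            raise KeyError(f"state {state} has no transition for signal {signal}")
        state = next_states[signal]
        visited.append(state)
    return visited


print(trace(TABLE, "s1", ["cx", "ct(1;0)", "cz", "ct(1;1)"]))
```

Each row pairs the commands issued in a state with the branching that the
completion signals select, which is the information a branching logic unit
and a command matrix must jointly encode.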

1.2.6  Description of V Structures

In  order  to  develop  systematic  procedures  for  designing  V  structures, 
it  is  necessary  to  establish  a  framework  for  systematic  description  of  such 
structures. As a part of this framework, the concepts of macro-description
and micro-description of V structures are formulated.

1.2.7  Macro-description

A macro-description of a V structure will specify the following important
large-scale features of the structure:

1.  Computational characteristics in terms of the compound commands
available to the supervisory control. For each such command, the
upper and lower bounds and the average of its computation time, the
% hardware used exclusively, and the % hardware time shared are given.

2.  The inventory involved in terms of functional units, such as
control units, arithmetic units, memory units (including any
input-output units), registers, combinatorial logic units, non-digital
devices, etc.

3.  The logical interconnections between the functional units.

4.  The change from standard state in terms of functional units.

5.  The cost associated with the change from standard state, in
terms of time, hardware, and other significant parameters (if any).

6.  Computational history of the structure in terms of problems in
which used, how many times, how long, errors, etc.

1.2.8  Micro-description

A micro-description specifies the detailed design of the V structure
as follows:

1.  The  elementary  commands  and  the  elementary  compound  commands 
are  listed. 

2.  Computational  algorithms  for  each  of  the  compound  commands 
available  to  the  supervisory  control  are  expressed  in  mathematical 
terms. 

3.  Designs of control unit structures for executing compound commands
are given.

4.  Computation times of the compound commands are given in terms
of lower order execution times, until all times are in terms of a set
of elementary operation times.

5.  Functional  units  are  described  in  terms  of  basic  amplifier  modules 
and  logic  (combinatorial)  modules.  The  extent  of  fan-in  and  fan-out  of 
each  basic  module  is  given.  Other  types  of  modules  may  be  defined 
as  need  arises. 

6.  Logical  interconnections  of  the  basic  modules  are  given. 

7.  Locations of basic modules are given, referred to an adequate
coordinate system (coordinates may refer to frames, motherboards,
locations on motherboards, etc.).

8.  Physical interconnections of basic modules are given in an adequate
coordinate system (coordinates may refer to connectors and pins; types
of connecting wires and lengths of connecting wires may be specified).

9.  A  change  from  the  standard  state  is  given  in  terms  of  change  in 
the  requirement  for  different  types  of  basic  modules,  change  in  the 
type  of  the  basic  module  at  a  given  location,  and  in  terms  of  change 
of  physical  interconnections  between  basic  modules. 

10.  The  cost  of  a  change  from  the  standard  state  is  given  in  terms  of 
costs  of  change  of  lower  order  units  of  the  V  structure,  until  the  costs 
are  expressed  in  terms  of  a  set  of  elementary  cost  units. 

1.2.9  Notation for Macro-description

Let $V(i,k;\ p_1;\ p_2;\ p_3;\ p_4;\ p_5;\ p_6)$ represent the i-th
configuration of the V inventory, referred to standard state k. The $p_1$
through $p_6$ refer to different properties of V(i,k).

1.2.9.1  Computational characteristics of V(i); property $p_1$

The computational characteristics in a macro-description of V(i,k) are
expressed as a list of the compound commands available to the supervisory
control. For each of the compound commands, its computing times (maximum,
average, and minimum) and V hardware utilization (exclusive and shared) are
given.


$$V(i;\ p_1) = \{\, (\, f(k,c;q_i),\ T\text{-}f(k,c;q_i),\ I\text{-}f(k,c;q_i),\ C\text{-}f(k,c;q_i) \,) \,\}$$

where $k = 1, \ldots, K$ and $i = 1, \ldots, I$.

$f(k,c;q_i)$ = k-th compound command of class c available to the
supervisory control. The property $q_i$ refers to the i-th computational
algorithm of $f(k,c)$ and to the design of the corresponding control unit.

$T\text{-}f(k,c;q_i) = (\, Tma\text{-}f(k,c;q_i),\ Tav\text{-}f(k,c;q_i),\ Tmi\text{-}f(k,c;q_i) \,)$
specifies the maximum, average, and minimum computation times of
$f(k,c;q_i)$, respectively.

$I\text{-}f(k,c;q_i) = (\, Ie\text{-}f(k,c;q_i),\ Is\text{-}f(k,c;q_i) \,)$
specifies the amount of inventory utilized by $f(k,c;q_i)$ exclusively and
shared, respectively.

$C\text{-}f(k,c;q_i) = \{c_j\}$ specifies the classes of compound commands
which may be executed simultaneously with $f(k,c;q_i)$.

1.2.9.2  Description of inventory; property $p_2$

The inventory of V(i,k) is described in terms of functional units:
control units, arithmetic units, registers, combinatorial logic units,
memories, and non-digital devices. In the following notation the indices
i, k are omitted for simplicity.


$$V(i,k;\ p_2) = (\, CU(j;p_k),\ \ldots,\ NU(j;p_k) \,)$$

$CU(j;p_k)$ : j-th control unit with property $p_k$. Property $p_k$
specifies the compound commands mechanized by CU(j), the logical design
of the control unit, the amount of hardware utilized, and the fan-in
and fan-out distributions.

$AU(j;p_k)$ : j-th arithmetic unit with property $p_k$. Property $p_k$
specifies the arithmetic operations performed by AU(j), its logical
design, hardware requirements, and fan-in and fan-out distributions.

$RU(j;p_k)$ : j-th register unit with property $p_k$. Property $p_k$
specifies the size of the register in bits, its shifting or counting
capabilities, hardware required, and fan-in and fan-out distributions.

$LU(j;p_k)$ : j-th combinatorial logic unit with property $p_k$, where
$p_k$ specifies the number of parallel inputs, nature of the logical
operations performed, hardware required, and fan-in and fan-out
distributions.

$MU(j;p_k)$ : j-th memory unit with property $p_k$, where $p_k$
specifies the type of memory (or input-output unit), its size, access
and cycle times, priority structure, and special properties (such as
content addressable memory, disc file).

$NU(j;p_k)$ : j-th non-digital unit with property $p_k$, where $p_k$
describes the nature of the non-digital unit.


1.2.9.3  Logical interconnections between functional units; property $p_3$


Logical connections between functional units specify the required data
paths and control lines. Interconnections are specified at the "bit" level of
the data handling functional units (registers, etc.) along with the control
signal if the particular data path is gated.

Let $FU(j;p_k)$ represent a general functional unit. Interconnections
between functional units can be specified as:

$$V(i;\ p_3) = \{\, FU(j;\ r\text{-}s) \rightarrow FU(k;\ t\text{-}v)\ ;\ mx(u) \,\}$$

where a particular element of the set states that the bits r through s of
FU(j) are connected or transferred to the bits t through v of FU(k) under the
control of the elementary transfer command mx(u).

1.2.9.4  Functional unit level change from the standard state; property $p_4$

The standard state, V(k), will be considered as a reference frame for
changes in V structure. The macro-description of a change of V(i,k) from V(k)
is given in terms of functional units and parts of functional units of V(k).

$$V(i;\ p_4) = \{\, (\, FU(i,j) = FU(k,m;\ r\text{-}s) + FU(k,m+1;\ u\text{-}v) + \ldots \,) \,\}$$

which states that FU(i,j) is composed of bits r through s of FU(k,m), bits u
through v of FU(k,m+1), and so on.

1.2.9.5  Cost of change from standard state; property $p_5$

The total cost of change from V(k) to V(i) is a weighted function of costs
in time, hardware, and other pertinent parameters. Similarly, the cost of
changing V(i) to V(k) is expressed in terms of the costs of time, hardware,
etc.

$$V(i;\ p_5) = CS(k \rightarrow i) + CS(i \rightarrow k)$$

$CS(k \rightarrow i)$ is the total cost of change from V(k) to V(i), and
$CS(i \rightarrow k)$ is the total cost of change from V(i) to V(k). Both are
expressed in terms of more detailed costs:

$$CS(k \rightarrow i) = u_1\, CST(k \rightarrow i) + u_2\, CSH(k \rightarrow i) + \ldots$$

$$CS(i \rightarrow k) = v_1\, CST(i \rightarrow k) + v_2\, CSH(i \rightarrow k) + \ldots$$

where $CST(k \rightarrow i)$ is the total cost in time, $CSH(k \rightarrow i)$
is the total cost in hardware, and the $u_n$ and $v_n$ are weighting factors.
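The weighted cost expression can be evaluated mechanically once the
component costs and weights are known. The sketch below is only an
illustration: the component names, cost values, and weighting factors are
hypothetical, since the report does not give numerical figures.

```python
def change_cost(costs: dict, weights: dict) -> float:
    """Weighted sum of component costs, e.g. u_1*CST + u_2*CSH + ..."""
    if costs.keys() != weights.keys():
        raise ValueError("every cost component needs exactly one weighting factor")
    return sum(weights[name] * costs[name] for name in costs)


def round_trip_cost(forward: dict, backward: dict, u: dict, v: dict) -> float:
    """CS(k->i) + CS(i->k) for one configuration, as in property p_5."""
    return change_cost(forward, u) + change_cost(backward, v)


# Hypothetical component costs (arbitrary units) and weighting factors.
forward = {"time": 4, "hardware": 10}    # CST(k->i), CSH(k->i)
backward = {"time": 3, "hardware": 0}    # CST(i->k), CSH(i->k)
u = {"time": 2, "hardware": 1}
v = {"time": 2, "hardware": 1}

print(round_trip_cost(forward, backward, u, v))
```

Keeping the weights separate from the measured costs lets the same
measurements be re-evaluated when the relative importance of time and
hardware changes.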

1.2.9.6  Computational history of V(i); property $p_6$

The computational history of a structure V(i) includes: 1) the total time
V has been used in this structure, with various breakdowns of the time into
computation time, restructuring time, and troubleshooting time; 2) comments
as to the efficiency of the structure; 3) recommendations for change; and
4) a list of users.

1.2.10  Notation for Micro-description


A micro-description of V(i) expresses the properties defined in the
macro-description in terms of more elementary units.

1.2.10.1  Computational characteristics; property $p_1$

$$f(k,c;q_i) = (\, \{f(j,c;q_k)\},\ \{ma(j,k)\},\ \{mi(j,k)\} \,)$$

where $ma(j,k)$ is an elementary compound command, and $mi(j,k)$ represents
elementary commands of the types mx(j,k), mz(j,k), mt(j,k), and me(j,k).

$$Tma\text{-}f(k,c;q_i) = g(\, Tma\text{-}f(j,c;q_k),\ Tma\text{-}ma(j,k),\ Tma\text{-}mi(j,k) \,)$$

where $Tma\text{-}ma(j,k)$ and $Tma\text{-}mi(j,k)$ are maximum execution
times of the compound elementary and elementary commands, respectively, and

$$Tma\text{-}ma(j,k) = g(\, Tma\text{-}mi(j,k) \,)$$

$$Tma\text{-}mi(j,k) = g(\, Tma\text{-}FU(j;p_k) \,)$$

where $Tma\text{-}FU(j;p_k)$ is the maximum operation time of the functional
unit $FU(j;p_k)$.

$$Tma\text{-}FU(j;p_k) = g(\, \{Tma\text{-}AM(j,k;p_m)\},\ \{Tma\text{-}LM(j,k;p_m)\} \,)$$

where $Tma\text{-}AM(j,k;p_m)$ and $Tma\text{-}LM(j,k;p_m)$ stand for maximum
operation times of amplifier and logic basic modules, respectively.

$Tav\text{-}f(k,c;q_i)$ and $Tmi\text{-}f(k,c;q_i)$ are expressed in the same
manner as given for $Tma\text{-}f(k,c;q_i)$.
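The recursive reduction of computation times to elementary operation times
can be sketched as below. The report leaves the function g unspecified; this
sketch assumes, purely for illustration, that the constituents of a command
execute one after another, so that g is a sum. Constituents that execute
simultaneously would instead contribute their maximum. All command names and
times are hypothetical.

```python
def t_max(command: str, composition: dict, elementary_times: dict) -> float:
    """Maximum execution time of a command, reduced to elementary times.

    Assumes sequential execution of constituents (g taken as a sum).
    """
    if command in elementary_times:
        return elementary_times[command]
    return sum(t_max(part, composition, elementary_times) for part in composition[command])


# Hypothetical maximum times of elementary commands (arbitrary units).
elementary_times = {"mx": 1.0, "mz": 0.5, "mt": 2.0}
# Hypothetical composition: an elementary compound command built from
# elementary commands, and a compound command built from both.
composition = {
    "ma_add": ["mx", "mz", "mx"],
    "f_sum": ["ma_add", "ma_add", "mt"],
}

print(t_max("ma_add", composition, elementary_times))
print(t_max("f_sum", composition, elementary_times))
```

The same traversal, applied with average or minimum elementary times, gives
the corresponding Tav and Tmi figures under the same assumption about g.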

$$Ie\text{-}f(k,c;q_i) = (\, Ne\text{-}AM,\ Ne\text{-}LM \,)$$

where Ne-AM and Ne-LM are the numbers of different basic modules used
exclusively by the compound command $f(k,c;q_i)$. A listing of these may also
be presented.

$$Is\text{-}f(k,c;q_i) = (\, Ns\text{-}AM,\ Ns\text{-}LM \,)$$

where the Ns terms are defined as numbers of different basic modules shared
with other compound commands. A listing of the modules and of the compound
commands which share them may be included in this description.

1.2.10.2  Description of inventory; property $p_2$

A general functional unit, $FU(j;p_k)$, is described in terms of circuit
modules:

$$FU(j;p_k) = (\, \{AM(F,j,k;r_m)\},\ \{LM(F,j,k;r_m)\} \,)$$

$AM(F,j,k;r_m)$ designates the k-th basic amplifier module belonging to
FU(j). Property $r_m$ specifies the type of the amplifier module, its fan-in
and fan-out capabilities, power requirements, and speed.

$LM(F,j,k;r_m)$ is the k-th basic logical module of FU(j). The property
$r_m$ specifies the logical functions.

1.2.10.3  Logical and physical interconnections; property $p_3$

Logical and physical interconnections of basic modules are given in list
form. The inputs and outputs of basic modules are designated by letters and
numbers. The inputs are designated by letter "a", outputs by letter "z". Thus,
for general basic module BM(F,j,k),

$$(\, z2\text{-}BM(F,j,k) \,) \rightarrow (\, a5\text{-}BM(F,j,m) \,)$$

denotes a logical interconnection between the output z2 of BM(F,j,k) and
input a5 of BM(F,j,m).

In order to introduce physical wiring, notation for connectors, wire
types, and wire length has to be introduced. Connectors can be specified as
$CO(i,j,k;s_m)$, where three coordinates are allowed for specifying the
connector, and property $s_m$ specifies the type of the connector. Wiring can
be denoted as $WI(u_m, v_m)$, where property $u_m$ specifies the kind of
wiring (e.g., printed wire, twisted pair, coax, etc.), and property $v_m$ the
length of the wire.

A complete notation for combined logical and physical interconnection is
then as follows:

$$(\, z2\text{-}BM(F,j,k) \,) \rightarrow (\, a5\text{-}BM(F,j,m) \,)\ :\ WI(u_1,v_1) \rightarrow CO(i,j,k;s_1) \rightarrow WI(u_2,v_2) \rightarrow \ldots \rightarrow WI(u_n,v_n) \rightarrow CO(i,j,k;s_n) \rightarrow WI(u_{n+1},v_{n+1})$$

where the physical connection is given by a chain of wires and connectors.
The coordinate system used to specify the locations of basic modules
depends on the physical construction scheme to be used. A preliminary
assumption about the system, however, is used here to indicate the nature of
location coordinates. Assuming that the system consists of frames,
motherboards, and modules, the location of a module is given by a frame
number, two coordinates for the board in the frame, and two coordinates for
the module location on the board:

$$L\text{-}BM(F,j,k) = (f;\ g, h;\ p, q)$$

where f is the frame number, g, h the board coordinates, and p, q the module
position coordinates.

Assignment of locations to basic modules is not a trivial matter. At high
circuit speeds, distances between modules become significant both in
contributing to propagation delay and in capacitive loading of the outputs.
Thus policies such as minimizing the total wire length of interconnections,
or minimizing wire lengths of the interconnections with most activity, may be
used. Since, most likely, more sophisticated wiring (such as coax cables) is
required for interconnections exceeding a certain length, the first policy
may also reduce the wiring costs.
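The first placement policy can be sketched as a comparison of total wire
length between candidate placements. The distance measure below is an
assumption made only for illustration: a weighted sum of coordinate
differences over (f; g, h; p, q), with hypothetical weights standing in for
frame, board, and module spacing. The module names and placements are
likewise hypothetical.

```python
from typing import NamedTuple


class Location(NamedTuple):
    """Module location (f; g, h; p, q): frame, board coordinates, module coordinates."""
    f: int
    g: int
    h: int
    p: int
    q: int


# Hypothetical relative spacing of frames, boards, and module positions.
WEIGHTS = (100, 10, 10, 1, 1)


def wire_length(a: Location, b: Location) -> int:
    """Assumed distance measure: weighted sum of coordinate differences."""
    return sum(w * abs(x - y) for w, x, y in zip(WEIGHTS, a, b))


def total_wire_length(placement: dict, connections: list) -> int:
    """Total length over all (source module, destination module) interconnections."""
    return sum(wire_length(placement[src], placement[dst]) for src, dst in connections)


connections = [("BM1", "BM2"), ("BM1", "BM3")]
placement_a = {
    "BM1": Location(0, 0, 0, 0, 0),
    "BM2": Location(0, 0, 0, 3, 4),
    "BM3": Location(0, 1, 0, 0, 0),  # on a neighbouring board
}
# Same modules, with BM3 moved onto the same board as BM1.
placement_b = dict(placement_a, BM3=Location(0, 0, 0, 1, 0))

print(total_wire_length(placement_a, connections))
print(total_wire_length(placement_b, connections))
```

The second policy mentioned above could be sketched in the same way by
multiplying each interconnection's length by an activity weight before
summing.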

1.2.10.4  Change from standard state; property $p_4$

The change from standard state may be described by listing all locations
where basic modules are to be replaced by different basic modules, and by
listing all interconnections to be removed and all new interconnections to be
established.

1.2.10.5  Cost of change; property $p_5$

The cost terms given in the macro-description are expressed as functions
of more elementary cost terms: cost of design of the new structure, cost of
wiring, cost of debugging, costs of different basic modules and connectors,
costs of preparing printed circuit boards (if required), and so on. The exact
nature of the cost function can be determined only after a physical
construction scheme and a mechanical restructuring procedure have been
adopted.

1.2.11  Change in V Structures

As stated previously, changes in V structure may be effected on several
levels: electronically, electromechanically, and mechanically.

Electronic structure change implies enabling sets of gates, mainly in the
control hierarchy, which permit a different computation to be carried
out. Electronic switching from one structure to another is therefore
under the control of a signal, generated at some level of the control
hierarchy, which can be classified as a compound command. Electronic
structure change is thus a structure change only in a trivial sense; in
principle, it is no different from selecting one command or another in a
conventional machine.

Electromechanical structure change may take the form of closing relay
contacts. In that case, except for the switching time, it is in concept
equivalent to electronic switching, since the structures which are
switched have been constructed beforehand and the control signal which
performs the switching is an equivalent compound command.

It is the capability for mechanical restructuring, such as physically
altering the logical properties of basic modules, altering their
interconnections, and changing their physical locations, which permits
two V configurations to be completely unrelated and yet share, to a large
extent, the same elementary hardware. To be significant, such structure
changes must be effected with a minimum of V downtime, both in actual
restructuring and in debugging. (It is important to note that F may still
do useful work while V is being restructured.) This requirement implies
the maximum possible restructuring and debugging off-machine, easy
changes in the logical properties of the circuit modules, and a
systematic procedure for changing their interconnections. (In the
extreme, one can visualize completely built and debugged wiring frames
for the new structure, into which the circuit modules, with proper
changes in pluggable cap connections, are inserted; the new frames then
replace the old ones in the F+V system.)

It is clear that the physical construction scheme of V largely determines
the cost of mechanical restructuring, both in time and in hardware. A
clear definition of such a construction scheme has to be given before
realistic cost figures (which are of great importance in allocating
computations to V) can be obtained.

Further,  in  reference  to  allocation  of  computations  in  the  F+V  system, 
the  extent  of  mechanical  structure  change  associated  with  a  problem  brings 
in  the  problem  of  scheduling  computations  in  F+V,  such  that  the  total  cost  of 
mechanical  structure  changes  for  a  batch  of  problems  is  minimized. 

1.2.12 The Supervisory Control Unit, SC

The principal function of the supervisory control unit is to coordinate
the computational activity of the F and the V parts of the F+V system. In
particular, the SC must enforce the precedence requirements of
computations according to the computational structure of the
computational task on hand. In order to execute its functions, SC can
halt and release the computations in F and in V, order information
transfers between F, V, and other units of the system, and order
structure changes in V.

The control sequence in SC may be a wired program, a stored program in SC
proper, or a stored program in F. Communication with F and with the
stored master program may be in the form of responses to interrogation
commands in the stored program; e.g., at certain points the stored
program will enter into interrogation loops from which it will be
released only after a certain condition in SC is satisfied. Completion of
operations in F is reported to the SC by special commands.
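
The interrogation-loop form of communication can be pictured with a small
simulation. This is only an illustrative sketch: the class
SupervisoryControl, its methods, the condition name "V_done", and the
event schedule are hypothetical names introduced here and are not part of
the F+V design.

```python
class SupervisoryControl:
    """Toy stand-in for SC: it only remembers which conditions are satisfied."""

    def __init__(self):
        self._satisfied = set()

    def report_completion(self, condition):
        # Plays the role of the "special command" by which a completed
        # operation is reported to SC.
        self._satisfied.add(condition)

    def interrogate(self, condition):
        # Response to an interrogation command from the stored program.
        return condition in self._satisfied


def interrogation_loop(sc, condition, schedule, max_polls=1000):
    """Poll SC until `condition` holds; return the number of polls made.

    `schedule` maps a poll index to a condition that becomes satisfied at
    that moment (a simulated stand-in for concurrent activity elsewhere).
    """
    for poll in range(max_polls):
        if poll in schedule:
            sc.report_completion(schedule[poll])
        if sc.interrogate(condition):
            return poll + 1
    raise RuntimeError("condition was never satisfied")


sc = SupervisoryControl()
polls = interrogation_loop(sc, "V_done", {3: "V_done"})
print(polls)  # 4
```

The stored program is released on the first poll at which the condition
is found satisfied; until then it does no other work, which is the cost
of this form of communication.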

Communication with V may be on a more direct basis, since the SC may be
viewed as the control unit on the highest control level. As such, SC can
be built in the same manner as the rest of the control units in V, and
all computations in V are available to it as compound commands.
Completion signals are then received in the manner outlined for the
control model in Section 1.2.4.

Information transfers between F and V are ordered by SC either under the
control of its program (wired or stored), for large blocks of data, or
automatically. The latter case is common when executing a computation on
an argument supplied by F and transferring the result back to F. In data
transfer operations, SC will perform similarly to the control unit of an
F data channel, except that it will have the highest priority. In this
manner, it may be possible, under certain conditions in F, to effect data
transfers between F and V without actually interrupting the computations
in the former.

I-3 Two-Dimensional Heat Conduction Problem


A.  Problem  Statement 

A mathematical statement of the linear two-dimensional diffusion equation
on a rectangular plate with Dirichlet conditions.

B. The Alternating-Direction Algorithm

A  statement  of  the  A-D  algorithm  with  a  derivation  of  the  procedure 
for  its  use  in  computation. 

C.  Numerical  Stability  and  Convergence 

Development of the truncation errors involved in the A-D algorithm and an
analysis leading to conditions under which the linearity requirements of
a problem may be removed.

D.  Computational  Experiments 

Comparison  is  made  between  the  results  of  the  nonlinear  extension 
of  the  A-D  algorithm  and  results  of  another  computational  procedure  for  the 
solution  of  a  nonlinear  diffusion  equation. 

E.  Conclusions 

An extension to the A-D algorithm was developed. This development allows
an alteration in Δt at each time step. More than this, it prescribes a
computational procedure for automatically selecting the largest possible
time step consistent with numerical stability. This procedure was
extended to allow the inclusion of a nonlinear diffusion coefficient in
the two-dimensional diffusion equation. Without a statement of this
stability bound and the consequent ability to adjust Δt at each time
step, the A-D algorithm would lose much of its effectiveness as a method
for solving the nonlinear diffusion equation.

The method was applied to a nonlinear two-dimensional diffusion equation
satisfied on a square cross-section. An iterative procedure using this
method yielded a transient solution which was within 3 percent of the
"true" solution. In arriving at this solution, the time steps were
automatically increased consistent with numerical stability.
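
As an illustration of the alternating-direction idea summarized in item
B, the following sketch advances the linear equation u_t = u_xx + u_yy on
the unit square by one step that is implicit first in x and then in y.
The Peaceman-Rachford form of the step, the unit diffusivity, the fixed
time step, and all function names are assumptions made for illustration;
the variable-step and nonlinear extensions described in this report are
not reproduced.

```python
import math


def solve_tridiagonal(sub, diag, sup, rhs):
    """Thomas algorithm for a tridiagonal system with constant coefficients."""
    n = len(rhs)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = sup / diag
    d[0] = rhs[0] / diag
    for i in range(1, n):
        m = diag - sub * c[i - 1]
        c[i] = sup / m
        d[i] = (rhs[i] - sub * d[i - 1]) / m
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def half_step(u, r):
    """Half step: implicit in the first index, explicit in the second.

    u is a square grid (list of lists) whose edge entries hold the fixed
    Dirichlet boundary values; r = dt / h**2.
    """
    n = len(u) - 2                      # interior points per direction
    new = [row[:] for row in u]         # boundary values are carried over
    for j in range(1, n + 1):
        rhs = [u[i][j] + 0.5 * r * (u[i][j - 1] - 2 * u[i][j] + u[i][j + 1])
               for i in range(1, n + 1)]
        rhs[0] += 0.5 * r * u[0][j]     # known boundary neighbours
        rhs[-1] += 0.5 * r * u[n + 1][j]
        column = solve_tridiagonal(-0.5 * r, 1.0 + r, -0.5 * r, rhs)
        for i in range(1, n + 1):
            new[i][j] = column[i - 1]
    return new


def transpose(u):
    return [list(row) for row in zip(*u)]


def adi_step(u, r):
    """One full time step: implicit in x, then implicit in y."""
    u = half_step(u, r)
    return transpose(half_step(transpose(u), r))


# Decay of the mode sin(pi x) sin(pi y), whose exact amplitude at time t
# is exp(-2 pi^2 t); the boundary values are zero.
n = 19
h = 1.0 / (n + 1)
dt = 0.01
u = [[math.sin(math.pi * i * h) * math.sin(math.pi * j * h)
      for j in range(n + 2)] for i in range(n + 2)]
for _ in range(10):
    u = adi_step(u, dt / h ** 2)
center = u[10][10]
exact = math.exp(-2 * math.pi ** 2 * 0.1)
print(center, exact)
```

Each half step requires only tridiagonal solutions, one per grid line,
which is what makes the alternating-direction approach economical
compared with a fully implicit two-dimensional solve.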
I-4 Pattern Recognition


INTRODUCTION


1.4.1 Pattern Recognition by Machine

We have entered an era of transition in which electronic computing
machines are taking over many tasks formerly done exclusively by humans.
When it is known beforehand that data is to be machine processed, it is
usually simplest and most economical to record the data originally in
machine form. For all past history, and for information whose processing
cannot be well predicted, it is not reasonable to expect such
preparation. As a result one often finds situations in which data
presented in a form suitable for humans must be processed by a machine
instead. At present, with few exceptions, this data must be "coded",
i.e., converted to a form acceptable by the machine in question, before
it can be processed. Usually this coding process involves operations such
as punching holes in cards or tape and must be accomplished by humans.

In many cases, the amount of labor necessary for coding data is very
small compared to the labor saved by using a machine to process the data.
Sometimes, however, the time and expense of coding may negate the
advantages of using a machine in the first place. Language translation,
processing of bank checks, and sorting of mail are but a few of the many
tasks in which the necessity of manual coding would obviously render the
use of machines unfeasible. It is evident that an automatic coder,
capable of rapidly and accurately reducing printed, written, or spoken
data to a form amenable to machine processing, would greatly extend the
range of tasks to which computers can be profitably applied.

Hence it is not at all surprising that the problem of automatic
recognition of visual patterns by machine is receiving a great deal of
attention by research workers at present. Since numerical data and most
languages are expressed by means of a small number of discrete symbols or
"characters", it is clear that automatic coding can be achieved only by a
machine capable of recognizing such characters. "Recognizing" in this
context means the ability to respond to each character with a unique
signal. In other words, a machine designed to recognize K distinct
characters P_1, P_2, ..., P_K must satisfy the following requirements:

(a) The machine must have K distinct output signals O_1, O_2, ..., O_K
    corresponding to the characters P_1, P_2, ..., P_K respectively, and
    a "reject" output, such that

(b) Whenever the input to the machine is P_k (1 <= k <= K), the output
    is O_k.

(c) Whenever the input cannot be uniquely associated with one of the
    characters P_1, P_2, ..., P_K, the output is the "reject" output.
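
Requirements (a) to (c) can be made concrete with a small sketch. It
assumes, purely for illustration, that characters are represented as
tuples of binary cell values and that "uniquely associated" means an
exact match with exactly one prototype; the names REJECT and recognize,
and the three prototypes, are hypothetical.

```python
REJECT = 0  # hypothetical code for the "reject" output


def recognize(unknown, prototypes):
    """Return k (1-based) if `unknown` matches prototype P_k only, else REJECT."""
    matches = [k for k, proto in enumerate(prototypes, start=1)
               if proto == unknown]
    return matches[0] if len(matches) == 1 else REJECT


prototypes = [(1, 0, 1), (0, 1, 0), (1, 1, 1)]   # P_1, P_2, P_3 (made up)
print(recognize((0, 1, 0), prototypes))  # 2
print(recognize((0, 0, 0), prototypes))  # 0, i.e. reject
```

Exact matching is the most restrictive possible notion of association;
practical schemes replace it with a similarity measure, but the role of
the reject output stays the same.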

In a character-recognition problem, one may begin by considering a set of
ideal forms or "prototype" patterns corresponding to the characters to be
recognized. Unfortunately, in actual situations the characters to be
recognized may vary considerably from the prototype patterns. These
variations arise in general from two sources. First, one may encounter
systematic variations such as the use of type fonts which vary from the
prototype font, or misalignment of the characters and the reading
element. Second, there are unavoidable random variations or noise which
arise from such sources as over-inking of type, over-used typewriter
ribbons, ink smears, etc. Noise may also be introduced by imperfections
in the recognition apparatus itself. A character-recognition system must
be designed to recognize not only the prototype patterns, but also all
distortions and variations of them likely to be encountered. Obviously
the optimal character recognition scheme for a given application will
depend upon the nature of the prototype patterns, the actual characters
to be recognized, and the level of accuracy which must be maintained. In
Section 1.5.2, a number of character recognition schemes suitable for
mechanization are outlined. These range from conceptually trivial
schemes, which suffice when the unknown characters are sufficiently
similar to their prototypes, to more sophisticated and general methods,
which must be employed when dealing with, for example, sloppy handwritten
characters.

1.4.2 The Concept of "Significant Features"

Character recognition schemes all make use (implicitly, at least) of
certain significant features which serve to distinguish the various
characters from one another. In general, these significant features occur
in association with irrelevant and redundant features which do not aid
the recognition process and, in fact, may hinder it. As a result many of
the recognition schemes which have been proposed are grossly inefficient,
because this useless information is processed together with the
significant information instead of being eliminated or ignored. A
systematic procedure for finding the most significant differentiating
features of a set of characters would therefore be a very useful aid in
increasing the efficiency of pattern recognition schemes. As will be
shown later, such a procedure might also be useful in a self-adaptive
character recognition system (i.e., a recognition system which
automatically designs or organizes itself). A successful system of this
type would be an interesting advance in the field of "artificial
intelligence". In this report, a quantitative measure of the
"significance" of a feature is proposed and tested by a series of
experiments on the IBM 7090 digital computer at Western Data Processing
Center. The methods introduced in this report are based upon fundamental
concepts of statistics and information theory and introduce no
restrictive assumption of an environment free of noise and distortion.

In order to clarify the notion of "significance" of a feature, a simple
example will now be discussed. Consider the idealized set of patterns
shown in Figure 1.10. These patterns are each composed of 9 elementary
areas or cells, numbered as in Fig. 1.10a. Let us restrict our attention
to the 9 features F_i (i = 1, 2, ..., 9) defined as follows: a pattern
will be said to possess feature F_i if and only if cell C_i is black for
that pattern. Thus P_1 and one other pattern possess F_1, whereas the
remaining two patterns do not.


FIGURE 1.10  (a) Cell numbering; patterns P_1, P_2, P_3, P_4

We observe that certain cells (2, 4) are always white no matter which
pattern is considered. Similarly, some cells are always black (5, 8).
Therefore, these cells cannot serve to differentiate between any of the
patterns. In other words, the features F_2, F_4, F_5, and F_8 are
completely insignificant with respect to recognition of the patterns
presented here.

On the other hand, observation of other cells may yield a great deal of
information concerning the identity of the pattern. For example, if a
pattern possesses the feature F_1, then we can conclude that it is one of
the two patterns possessing that feature, and that it cannot be either of
the other two. Similarly, the presence of certain other features suffices
to identify a single pattern, etc.

Assuming that the patterns occur with equal frequencies, we can attempt
to rank the properties F_1, F_2, ..., F_9 according to significance as
follows. The "don't care" features F_2, F_4, F_5, F_8 must obviously be
placed at the bottom of the list. The properties F_3, F_6, F_9, and F_7
are of equal significance, since each of them divides the four patterns
three to one. So, on the average, the cells (3, 6, 9, 7) all contain
equal amounts of information concerning the pattern to be recognized, by
virtue of their identical distribution of black and white. Finally, we
are left with F_1. Cell 1 is black for two patterns and white for the
other two. It is not clear intuitively whether to rank F_1 above or below
F_3, F_6, F_9, and F_7; however, it will be demonstrated later that it is
proper to rank F_1 above the others. Thus, our final ranking must be as
follows:

F_1
F_3, F_6, F_7, F_9
F_2, F_4, F_5, F_8
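
The intuitive ranking can be checked numerically under one plausible
reading of "information": for noise-free, equally likely patterns, the
information a cell gives about the pattern equals the entropy of that
cell's black/white split. The sketch below uses only the split sizes
stated in the text. Treating entropy as the significance measure is an
assumption made here for illustration and is not necessarily the
criterion developed later in this report.

```python
import math


def split_information(black, total):
    """Entropy in bits of a cell that is black in `black` of `total` patterns."""
    info = 0.0
    for count in (black, total - black):
        if count:
            p = count / total
            info -= p * math.log2(p)
    return info


# Number of patterns (out of 4) in which each cell is black.
# A 3-to-1 split carries the same information whichever colour is the odd one.
black_counts = {1: 2, 2: 0, 3: 3, 4: 0, 5: 4, 6: 3, 7: 3, 8: 4, 9: 3}
info = {cell: split_information(b, 4) for cell, b in black_counts.items()}
ranking = sorted(info, key=lambda cell: -info[cell])
print(ranking)            # [1, 3, 6, 7, 9, 2, 4, 5, 8]
print(round(info[1], 3))  # 1.0
print(round(info[3], 3))  # 0.811
```

Under this measure an even split (cell 1) yields a full bit, a 3-to-1
split about 0.81 bit, and a constant cell nothing, which agrees with
placing F_1 at the top.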

In this example we have considered a set of particularly simple
properties, namely the presence of black in specified areas of the
pattern field. For the purposes of this report, viz., the development of
a criterion for the significance of properties, it will be advantageous
to restrict our attention to these particular properties. Moreover, as a
more convenient alternate viewpoint, we may regard the ranking of the
features F_1, F_2, ..., F_9 as a ranking of the cells C_1, C_2, ..., C_9
according to significance. Thus we obtain the significance ranking

C_1
C_3, C_6, C_7, C_9
C_2, C_4, C_5, C_8

for the cells of the pattern matrix. Instead of the significance of
general properties, in other words, we will study the significance of
cells or elemental areas of the pattern field. In this manner, the
experimental work will be considerably simplified, since there will be no
need to use complex property-testing procedures. It is clear, moreover,
that no generality will be lost, for any techniques developed for
determining the significance of pattern matrix cells would also be
suitable for determining the significance of properties in general.

A few salient points concerning the significance ranking demonstrated
above are worth noting. First of all, the significance of a feature is a
relative quantity which depends upon the observer's previous knowledge
concerning the identity or properties of the unknown character. The
significance ranking given above was based upon a tacit assumption of
complete ignorance concerning the identity of the unknown character,
except that it was a member of a set of four basic patterns. Suppose,
however, that it is already known that the unknown pattern is one of the
two patterns possessing F_1 (this situation would result if, for example,
cell 1 were examined and found to be black). A glance at Fig. 1.10
reveals that under these circumstances there would be no point in
examining cells 3 and 9, since one could predict with certainty that
these cells would be white. In other words, the previously significant
features F_3 and F_9 have become insignificant as a result of the
information which has been acquired*. On the other hand, we observe that
F_6 is now of greater importance than before. Indeed, observation of
cell 6 is now sufficient to determine the identity of the unknown
pattern.
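
The dependence of significance on prior knowledge can be illustrated
numerically. Since Fig. 1.10 is not reproduced here, the four patterns
below are hypothetical stand-ins, chosen only to satisfy the properties
stated in the text: cells 2 and 4 always white, cells 5 and 8 always
black, cell 1 black in two patterns, cells 3 and 9 white in both of
those, and cell 6 differing between them. Using the entropy of a cell as
its significance is likewise an assumption made for illustration.

```python
import math


def cell_information(patterns, cell):
    """Entropy in bits of cell `cell` (1-based) over equally likely patterns."""
    black = sum(p[cell - 1] for p in patterns)
    info = 0.0
    for count in (black, len(patterns) - black):
        if count:
            q = count / len(patterns)
            info -= q * math.log2(q)
    return info


# Hypothetical patterns; each tuple lists cells 1..9, with 1 = black.
patterns = [
    (1, 0, 0, 0, 1, 1, 0, 1, 0),
    (1, 0, 0, 0, 1, 0, 1, 1, 0),
    (0, 0, 1, 0, 1, 0, 0, 1, 0),
    (0, 0, 0, 0, 1, 0, 0, 1, 1),
]

candidates = [p for p in patterns if p[0] == 1]   # cell 1 was found black
before = {c: round(cell_information(patterns, c), 3) for c in (3, 6, 9)}
after = {c: round(cell_information(candidates, c), 3) for c in (3, 6, 9)}
print(before)  # {3: 0.811, 6: 0.811, 9: 0.811}
print(after)   # {3: 0.0, 6: 1.0, 9: 0.0}
```

Once cell 1 is known to be black, cells 3 and 9 carry no further
information, while cell 6 alone resolves the remaining uncertainty of one
bit.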

Another important point is that the significance of a feature is a
function of the frequency distribution of the patterns. Suppose, for
example, that one of the patterns is assumed to occur with a probability
of, say, 1/100 instead of 1/4. Then a correspondingly smaller
significance must be initially attributed to the feature which serves to
distinguish that pattern from the other patterns.
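
The effect of pattern frequency can be seen with a short calculation. It
assumes, for illustration only, that the distinguishing feature is
present exactly when the pattern in question occurs, and that the
information carried by the feature is measured by its binary entropy.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a feature that is present with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


print(round(binary_entropy(1 / 4), 3))    # 0.811
print(round(binary_entropy(1 / 100), 3))  # 0.081
```

Reducing the pattern's probability from 1/4 to 1/100 cuts the average
information obtained by examining the feature roughly tenfold, since the
outcome of the examination is almost always the same.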


* Note that this statement is completely analogous to the one which led
to placement of F_2, F_4, F_5, F_8 at the bottom of the significance
list.

A final consideration concerns the effects of noise and distortion on the
significance of features. Clearly, if a particular feature is subject to
a great deal of noise which occurs independently of the identity of the
unknown character, its significance is much lower than it would be if the
feature were noise-free and constant for any given pattern. As an extreme
example, one may consider a feature so overwhelmed by noise that its
presence or absence is almost independent of the identity of the unknown
pattern. Obviously, such a feature is rendered insignificant by the
noise.

On the other hand, if the noise in a particular cell is not independent
of the character being examined, the noise may actually convey
information concerning the identity of the unknown character, and the
significance of the cell might therefore be increased. For example,
consider again the ideal pattern set of Fig. 1.10, and assume that the
patterns are affected by noise in cell 2 as follows: (a) When the unknown
pattern is P_1, cell 2 appears black instead of white with a probability
of 1/2. (b) If the unknown pattern is not P_1, cell 2 always appears
white as in the ideal case. Suppose now that an unknown pattern is
examined, and it happens that cell 2 is black. One could immediately
conclude that the unknown pattern is P_1. This indicates that cell 2 has
a certain amount of significance. In the ideal case, on the other hand,
cell 2 is always white and hence has no significance whatsoever.
Therefore we conclude that noise of the type considered increases the
significance of cell 2.
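
The noise example can be quantified with the mutual information between
the pattern and the cell, I = H(cell) - H(cell | pattern). Using mutual
information as the measure is an assumption made here for illustration;
the probabilities are those of the example, with four equally likely
patterns and cell 2 black with probability 1/2 under the first pattern
and never otherwise.

```python
import math


def entropy(p):
    """Binary entropy in bits of an event of probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def mutual_information(priors, p_black_given_pattern):
    """I(pattern; cell) for a binary cell, in bits."""
    p_black = sum(w * q for w, q in zip(priors, p_black_given_pattern))
    h_cell = entropy(p_black)
    h_cell_given_pattern = sum(
        w * entropy(q) for w, q in zip(priors, p_black_given_pattern))
    return h_cell - h_cell_given_pattern


priors = [0.25, 0.25, 0.25, 0.25]
ideal = mutual_information(priors, [0.0, 0.0, 0.0, 0.0])
noisy = mutual_information(priors, [0.5, 0.0, 0.0, 0.0])
print(ideal)            # 0.0
print(round(noisy, 3))  # 0.294
```

The pattern-dependent noise raises the information in cell 2 from zero
to about 0.29 bit, in line with the conclusion of the example.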
Any generally useful quantitative measure of the significance of features
must deal with all of the above considerations. The effects of noise,
variation of the probabilities of occurrence of the patterns, and partial
knowledge concerning the identity of the unknown pattern must all be
properly taken into account. A number of schemes for determining
significant properties (or significant cells, to be more precise) have
been proposed. The writer will attempt to establish that the significance
criterion to be presented in this report is quite generally valid when
all the above factors are operative. Previously described methods are of
relatively restricted validity and may be utilized successfully only to
the extent that the pattern set is constrained by the same restrictions.
Several such methods will be discussed in Section 1.5.5, where a review
of recent literature pertaining to the significance of pattern features
(or areas) is given.

I-5 General Discussion and Review of Automatic Pattern Recognition


The purpose of this section is to acquaint the reader with the basic
principles of automatic character recognition devices and to review
recent articles and theses dealing with problems related to those
considered in this work.

1.5.1  The  Fundamental  Components  of  Pattern  Recognition  Systems 

We will begin by discussing the fundamental components of pattern
recognition machines. These are indicated in the block diagram of
Figure 1.11. It should be pointed out that the block diagram is an
idealization, and that in actual systems it is not usually possible to
separate the machine into independent units performing separate functions
as indicated. Nevertheless, the breakdown according to functions as shown
is useful conceptually and will enable us to discuss pattern recognition
in general terms. The pattern input block shown represents the pattern
itself, the document on which it appears, the feeding and alignment
mechanisms, etc.

FIGURE  1.11 


In visual character recognition, the function of the sensory unit is to
convert the character, which is a pattern of light and dark areas, into
electrical or other signals which can be conveniently processed by the
machine. In pattern recognition machines, the sensory unit generally
involves an arrangement of lenses, light sources, and photocells. In
special systems employing characters written in magnetic ink, the sensory
device is a magnetic read head.

In many practical systems, the sensory unit incorporates some type of
scanning mechanism. The scanning mechanism consists of a small sensory
element which sequentially inspects various areas of the pattern. The
scanning path employed may either be predetermined, or it may be made to
depend upon the pattern being scanned (e.g., the scanner might be
designed to follow the boundary of the pattern presented for
identification). Most commonly, the scanner follows a continuous,
predetermined path covering the entire pattern area.

The quantizing unit is an essential part of digital pattern recognizing
machines. In this unit, the output of the sensory devices, which is
generally a continuous function of space and/or time coordinates, is
quantized for digital processing. The most common type of quantization is
shown in Figure 1.12. The pattern is superimposed on a Cartesian grid or
matrix, and each cell is considered to be either all black or all white,
according as the amount of black area within the cell exceeds or fails to
exceed a given threshold.

Coding is a process whereby the quantized character signal is converted
into a series of numbers or code. An extremely simple coding method for
the quantized character shown in Fig. 1.12 consists of forming a matrix
of binary digits by placing a "1" in all black cells and a "0" in all of
the white cells (Fig. 1.13). Thus each pattern is represented as a matrix
whose elements are ones or zeros.
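
Quantization and the simple coding rule just described can be sketched
together. The sketch assumes that the sensory output is already available
as a fine grid of darkness values between 0 and 1, with 1 meaning fully
black; the function name, the block size, and the threshold of 0.5 are
illustrative assumptions.

```python
def quantize(image, block, threshold=0.5):
    """Reduce a fine grid of darkness values to a coarse binary matrix.

    Each coarse cell covers a `block` x `block` square of `image`. It is
    coded 1 (black) if its mean darkness exceeds `threshold`, and 0
    (white) if it fails to exceed it.
    """
    rows = len(image) // block
    cols = len(image[0]) // block
    coded = []
    for r in range(rows):
        row = []
        for c in range(cols):
            total = sum(image[r * block + i][c * block + j]
                        for i in range(block) for j in range(block))
            row.append(1 if total / block ** 2 > threshold else 0)
        coded.append(row)
    return coded


image = [
    [1.0, 0.9, 0.0, 0.1],
    [0.8, 1.0, 0.0, 0.0],
    [0.0, 0.2, 0.25, 0.75],
    [0.1, 0.0, 0.5, 0.5],
]
print(quantize(image, 2))  # [[1, 0], [0, 0]]
```

The lower right cell has a mean darkness of exactly 0.5 and is therefore
coded white, since it fails to exceed the threshold.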

It is useful to distinguish between "relative" and "absolute" codes.* A
relative code for a group of patterns is a code which contains just
enough information so that the representation for each pattern is
distinct. An "absolute" code, on the other hand, is one which contains
essentially all the information in the pattern. Given an absolute code
representation for a pattern, it is possible to reconstruct the original
pattern with an accuracy limited only by the "coarseness" of the
quantization process. The binary digit matrix of Fig. 1.13 is an example
of absolute coding. Obviously, for the purposes of character recognition,
a relative code is sufficient. Indeed, since a relative code usually
yields shorter representations for each pattern than an absolute code,
use of a relative code may result in considerable simplification of the
succeeding recognition unit.

The recognition unit operates upon the output of the coding unit (in
digital systems), and determines the identity of the unknown pattern. In
analog systems, the recognition unit generally processes the output of
the sensory block directly.

* This terminology is due to Kharkevich (15).

Another operation which is often necessary in pattern recognition systems
will be termed "pattern processing". The purpose of pattern processing is
to aid recognition by altering the unknown pattern (or its coded
representation), so that it will more closely resemble the corresponding
prototype or ideal pattern. Pattern processing involves operations such
as pattern centering, removal of spurious ink spots, smoothing of jagged
boundaries, etc. Operations of this type are usually necessary when the
patterns dealt with are subject to significant amounts of distortion and
noise. A "pattern processing" unit was not included in the block diagram
of Fig. 1.11 because this operation cannot be localized at all. Pattern
processing may occur in any (or all) of the blocks indicated, including
the recognition unit.

Sometimes it is possible to eliminate the need for pattern processing by
employing a code which yields representations which are invariant with
respect to the types of distortion present. It is clear that if the
output of the coding unit can be made independent of the distortion and
noise, so that it depends only upon the identity of the input pattern,
there is no need to remove the noise through separate pattern processing.
A great saving in hardware may be achieved in this manner.

1.  5.  2  Automatic  Pattern  Recognition  Techniques 

In order to illustrate some of the ideas introduced above, the main
types of automatic pattern recognition schemes will now be discussed. We
will begin with a discussion of the template-matching (or "masking") technique,
which is one of the simplest (but least powerful in noise and distortion tolerance)
of the methods which have been proposed.

As the name implies, this technique involves producing a "template"
or "mask" for each character. Each template is identical to the assumed
ideal form of the corresponding character. Recognition of an unknown
character is accomplished as follows: the unknown is compared with each template
by simple superposition, and is subsequently identified as the character
whose template gives the "best fit". More precisely, let us assume that a
digital computer is being employed to carry out the template-matching process.
The templates are stored in memory in the form of binary digit matrices of
the type illustrated in Fig. 1.13. The unknown character is also converted
to binary matrix form and compared with each template in turn by (1) counting
the number of times 1's occur in corresponding cells of the unknown and the
template, and (2) subtracting from this number the number of times a cell of
the unknown contains a zero while the corresponding cell of the template
contains a one (or vice versa). In this manner, a measure of similarity
between the unknown character and each prototype character is determined.
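
The two-step similarity measure just described can be sketched in a few lines of Python. This is only an illustration: the 3 x 3 templates, the labels "I" and "L", and the function names are hypothetical and are far coarser than any practical character matrix.

```python
def similarity(unknown, template):
    """Step (1) minus step (2): shared 1's minus cells that disagree."""
    shared_ones = 0
    disagreements = 0
    for row_u, row_t in zip(unknown, template):
        for u, t in zip(row_u, row_t):
            if u == 1 and t == 1:
                shared_ones += 1
            elif u != t:
                disagreements += 1
    return shared_ones - disagreements


def recognize(unknown, templates):
    """Return the label of the template giving the best fit."""
    return max(templates, key=lambda label: similarity(unknown, templates[label]))


# Hypothetical templates (assumed ideal forms).
TEMPLATES = {
    "I": [[0, 1, 0],
          [0, 1, 0],
          [0, 1, 0]],
    "L": [[1, 0, 0],
          [1, 0, 0],
          [1, 1, 1]],
}

# An "L" with one black cell lost to noise.
noisy_l = [[1, 0, 0],
           [0, 0, 0],
           [1, 1, 1]]

print(recognize(noisy_l, TEMPLATES))  # prints "L"
```

Because every disagreeing cell is penalized, a character that is shifted by even one cell can score badly against its own template, which is the weakness discussed next.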

In order for this procedure to work satisfactorily, it is clear that the
unknown characters must not deviate to any significant extent from their
templates. More precisely, for proper recognition an unknown character must
not deviate from its corresponding template by more than half of the "similarity
distance" between the template and its closest neighbor in the template space.
Therefore, the unknowns must be very accurately centered, corrected for tilt,
and optically (or otherwise) corrected for variations in size. Even when
provision has been made for accomplishing these difficult adjustments, the
method remains weak in the sense that new templates usually must be defined
for different fonts. For example, if an "H" appeared in a certain variant form
(the glyph is illegible in this copy), it would undoubtedly be misrecognized
as an "A". It is obvious that this system cannot be depended upon except for
the simplest applications. More elaborate and powerful template-matching
systems employ "statistical templates" which are based upon actual samples of
the characters rather than on assumed ideal forms.


FIGURE 1.12. Pattern Quantization

FIGURE 1.14b


An important class of pattern recognizers employs the single-slit
scanning procedure (8). The character is scanned by a thin slit which
moves across it as shown in Fig. 1.14a. The relative motion of the slit
is perpendicular to its length. The result of such scanning is an analog
voltage waveform characteristic of the figure being scanned. This waveform
is then compared with signals representing each of the possible characters.
The compare-character signal with the best match to the scanned signal is
selected as the machine-read character. With the slit-scan technique,
information about the location of ink along the length of the slit is lost. For this
reason, the discriminating power of recognition systems using this type of
scanning is somewhat restricted. In general, the number of characters must
be kept small, and the font must be specially designed to insure that the
signals produced by each character are sufficiently different from one another
(8). These drawbacks are to some extent compensated for by the greater
speed and positioning tolerance achievable in comparison with systems
employing two-dimensional scanning.

There are two common methods of generating the analog waveform.

In the first method, optical transducers are employed, and the waveform
voltage at each instant is proportional to the amount of ink within the area of
the slit. The result of scanning the character "0" by this method is indicated
in Fig. 1.14b.

In the second method, the characters are printed in magnetic ink and
the slit is replaced by a magnetic read head. The waveform voltage at each
instant is then proportional to the derivative of the black area under the
reading head. A system using this type of scanning has been perfected
jointly by Stanford Research Institute and General Electric (5) and is now
widely used by banks in check-processing. A specially designed type font is
employed, and the printing must be held to very close tolerances. The most
practical method for comparing the scanned waveform with the prototype
signals is based on the concept of "matched filters". A matched filter is
"matched" with respect to a given signal in such a way that the impulse
response of the filter is the time-inverse of the signal. That is,
$h(t) = A\,s(b - t)$, where $h(t)$ is the filter impulse response, $s(t)$ is
the signal being matched, and $A$ and $b$ are constants (we take $A > 0$).
If the input to this filter is the matched signal $s(t)$, then the output is

$$o(t) = \int_{-\infty}^{\infty} s(\tau)\,h(t - \tau)\,d\tau = A \int_{-\infty}^{\infty} s(\tau)\,s(b - t + \tau)\,d\tau .$$

At the particular instant $t = b$, we find:
$o(b) = A \int_{-\infty}^{\infty} s^2(\tau)\,d\tau$. It is an immediate
consequence of Schwarz's Inequality that $|o(t)| \le o(b)$ for all $t$. Suppose
now that a signal $q(t) \ne s(t)$ is applied as the input and that $q(t)$ has
been normalized so that
$\int_{-\infty}^{\infty} q^2(\tau)\,d\tau = \int_{-\infty}^{\infty} s^2(\tau)\,d\tau$.
Denoting the output under these conditions by $g(t)$, Schwarz's Inequality
yields the relation $|g(t)| \le o(b)$.

Assume now that matched filters have been constructed for each of the
compare-character waveforms. When an unknown character is scanned, the
waveform generated is applied to each matched filter $f_i$. The output $o_i(t)$
of each filter is normalized and monitored during an interval of time
containing the "in-phase" point $t = b$, and the maximum value $m_i$ of each output
held. It is clear from the preceding remarks that the largest of the $m_i$ will
be produced by the matched filter corresponding to the character being
scanned. In this manner, recognition is accomplished.
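
A discrete-time sketch of this matched-filter bank follows. It is an illustration under stated assumptions, not a description of any actual reading machine: the waveforms are short hypothetical sample sequences, both the input and the prototype are scaled to unit energy (the normalization step), and the peak $m_i$ is taken over every relative delay.

```python
import math


def normalize(signal):
    """Scale a sampled waveform to unit energy."""
    energy = math.sqrt(sum(x * x for x in signal))
    return [x / energy for x in signal]


def matched_filter_peak(waveform, prototype):
    """Maximum output m_i of the filter matched to `prototype`.

    The impulse response is the time-reversed prototype, so the output at
    each delay is a sliding inner product of the two sequences.
    """
    w = normalize(waveform)
    p = normalize(prototype)
    n, m = len(w), len(p)
    best = -math.inf
    for lag in range(-(m - 1), n):
        acc = sum(w[lag + k] * p[k] for k in range(m) if 0 <= lag + k < n)
        best = max(best, acc)
    return best


def recognize(waveform, prototypes):
    """Pick the compare-character whose matched filter peaks highest."""
    peaks = {label: matched_filter_peak(waveform, p)
             for label, p in prototypes.items()}
    return max(peaks, key=peaks.get), peaks


# Hypothetical slit-scan waveforms (amount of ink under the slit).
PROTOTYPES = {
    "0": [0, 3, 1, 1, 3, 0],
    "1": [0, 0, 4, 4, 0, 0],
}

# A delayed, slightly distorted scan of the character "0".
scanned = [0, 0, 3, 1, 2, 3, 0]
label, peaks = recognize(scanned, PROTOTYPES)
print(label)
```

By Schwarz's Inequality no normalized peak can exceed 1, and the value 1 is reached only when the input is a delayed copy of the prototype.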

Matched filters can be approximated using conventional filter theory,
or more simply by use of tapped delay lines as shown in Fig. 1.15. When
an impulse is applied to the input terminal, a voltage pulse is initiated which
travels along the line at the line's characteristic velocity.

FIGURE 1.15. Tapped delay line (Input, Delay line, Difference Amplifier, Output)

The resistors are adjusted so that the output of the system assumes the
proper matched-filter values at the instants when the disturbance in the
delay line reaches each tap. The difference amplifier makes it possible to
simulate matched filters having both positive and negative outputs. The
accuracy of the representation is dependent upon the number and closeness
of the taps.
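
The tapped delay line can be imitated in discrete time as follows. This is a minimal sketch assuming one tap per sample period and ideal, lossless delay; the tap weights stand in for the resistor settings, and the names are illustrative only.

```python
def tapped_delay_line(samples, tap_weights):
    """Simulate a delay line with one weighted tap per sample period.

    Taps with positive weights feed the '+' input of the difference
    amplifier, taps with negative weights feed the '-' input, and the
    output is the difference of the two sums.
    """
    line = [0.0] * len(tap_weights)      # voltage now present at each tap
    output = []
    for x in samples:
        line = [x] + line[:-1]           # the disturbance advances one tap
        plus = sum(w * v for w, v in zip(tap_weights, line) if w > 0)
        minus = sum(-w * v for w, v in zip(tap_weights, line) if w < 0)
        output.append(plus - minus)
    return output


signal = [1.0, -2.0, 3.0]                # hypothetical prototype waveform
taps = signal[::-1]                      # time-inverse of the signal

impulse_response = tapped_delay_line([1.0, 0.0, 0.0], taps)
response = tapped_delay_line(signal + [0.0, 0.0], taps)

print(impulse_response)  # [3.0, -2.0, 1.0]
print(response)          # [3.0, -8.0, 14.0, -8.0, 3.0]
```

The impulse response reproduces the tap settings, and the response to the matched signal peaks at 14, the signal energy, at the in-phase instant.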

The character-recognition schemes discussed so far are obviously
of restricted usefulness. A few more powerful methods will now be considered.
Foremost among these in importance is the "property list" method (19).
Basically, this method consists of making a list of properties characteristic
of each pattern to be recognized. The unknown pattern is then tested for the
presence or absence of each of the properties on each list. If the property
lists are properly constructed, it will always be possible to identify the
unknown pattern on the basis of the outcomes of the tests.

For any given character, it is usually possible to find some properties
which will persist even when the characters are severely distorted. For
example, the patterns "C", "€", "c", and "C", which are all common
representations for the third English letter, all have the property of being "open
toward the right". It is very uncommon to encounter a "C" which does not
have this particular property, even in the case of very sloppy handwriting.
Because of this "persistence of properties", the property-list method is
applicable to the recognition of highly distorted characters.

Many variations and refinements of the method are possible. For
example, for each pattern a list of properties not characteristic of the pattern
may be assembled. If the unknown pattern possesses properties not possessed
by a given prototype, then it can be concluded that the unknown differs from
that prototype. In this system, the unknown is identified by a process of
elimination.
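
Identification by elimination can be sketched with two small tables. Everything in the tables (the characters, the property names, and which properties are listed) is hypothetical and serves only to show the mechanics.

```python
# Hypothetical lists of properties characteristic of each prototype.
CHARACTERISTIC = {
    "C": {"open_right"},
    "O": {"closed_loop"},
    "L": {"open_right", "vertical_stroke"},
}

# Hypothetical lists of properties NOT characteristic of each prototype.
NON_CHARACTERISTIC = {
    "C": {"closed_loop", "vertical_stroke"},
    "O": {"open_right", "vertical_stroke"},
    "L": {"closed_loop"},
}


def identify(observed):
    """Return the prototypes that survive elimination.

    A prototype is eliminated if the unknown lacks one of its
    characteristic properties or shows one of its excluded properties.
    """
    survivors = []
    for name in CHARACTERISTIC:
        if not CHARACTERISTIC[name] <= observed:
            continue
        if NON_CHARACTERISTIC[name] & observed:
            continue
        survivors.append(name)
    return survivors


print(identify({"open_right"}))                     # ['C']
print(identify({"open_right", "vertical_stroke"}))  # ['L']
```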

In general, it is advantageous to combine the use of characteristic and
non-characteristic features in one recognition scheme. To illustrate the general
procedure, assume that 10 characters $N_1, N_2, \ldots, N_{10}$ are to be
recognized on the basis of the properties $F_1$, $F_2$, $F_3$, $F_4$. One
possible recognition scheme is illustrated in the "character tree" diagram of
Fig. 1.16. In this situation, we observe that one group of characters, ending
with $N_{10}$, possesses property $F_1$, while the remaining characters do not;
likewise certain characters possess $F_2$, while $N_1$, $N_2$, $N_3$, $N_4$ and
one other character do not, etc. (Several character subscripts in this sentence
are illegible.)


FIGURE 1.16. Direct Logic Recognition Scheme

A logical representation for the characters can be written as follows, a
prime denoting the absence of a property:

$N_1 = F_1' \cdot F_2' \cdot F_3' \cdot F_4'$;  $N_2 = F_1' \cdot F_2' \cdot F_3' \cdot F_4$;

$N_3 = F_1' \cdot F_2' \cdot F_3$;  ...  $N_{10} = F_1 \cdot F_2 \cdot F_3 \cdot F_4$ (any primes in this last expression are illegible)

If desired, these formulas may be expressed in the form of 4-digit binary
number codes by placing a 1 in the $i$th significant digit of the code, if the
property $F_i$ is possessed by the corresponding character, and a zero
otherwise. In this manner we obtain the following code representation:

$N_1$:  0000
$N_2$:  0001
$N_3$:  0011 or 0010

In the above recognition logic, it will be observed that at the $i$th
stage in the recognition process, the unknown character is always tested for
the presence of a specific property $F_i$, regardless of the outcome of previous
tests. In more complicated systems, the property which is tested for at each
stage is a function of previous test results, as in the following scheme
involving 8 characters $N_1, \ldots, N_8$ and 5 properties $F_1, F_2, \ldots, F_5$.

FIGURE 1.17

Three-digit binary number codes for the characters can be established as
follows: the $i$th significant digit is set equal to 1 if the property tested for at
the $i$th stage is present; otherwise, the $i$th significant digit is made zero.

In direct logic recognition schemes such as those illustrated above,
the binary code representation for each character must be distinct in order
to insure correct recognition. Therefore, if the number of patterns to be
recognized is $P$, and $l$ is the integer such that $2^{l-1} < P \le 2^{l}$, the
binary codes must be at least $l$ digits in length. This means that the
recognition process must consist of a sequence of at least $l$ tests. It is
convenient to write $l = \{\log_2 P\}$, where $\{x\}$ denotes the smallest
integer greater than or equal to $x$. If the properties to be tested for are not
maximally significant, a much larger number of tests may be necessary. In
order to speed up and simplify the recognition process, it is usually desirable
to minimize the number of tests necessary by employing very significant ones
only. On the other hand, the most significant tests may be very inconvenient
and complicated. Therefore, in practice, a compromise must be made between
simplicity and significance in choosing the properties to be tested for.
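
The lower bound on the number of tests, and a direct logic recognizer in which the property tested at each stage depends on earlier outcomes, can be sketched as follows. The four-character tree and its property names are hypothetical; they are not the trees of Figs. 1.16 and 1.17.

```python
def min_tests(num_patterns):
    """Smallest l with 2**l >= num_patterns (the ceiling of log2 P)."""
    return (num_patterns - 1).bit_length()


# Hypothetical character tree. Each internal node is a tuple
# (property tested, subtree if absent, subtree if present);
# each leaf is the name of a character.
TREE = ("F1",
        ("F2", "N1", "N2"),   # F1 absent: test F2 next
        ("F3", "N3", "N4"))   # F1 present: test F3 next


def recognize(properties, node=TREE, code=""):
    """Walk the tree and return (character, binary code of the outcomes)."""
    if isinstance(node, str):
        return node, code
    name, absent, present = node
    if name in properties:
        return recognize(properties, present, code + "1")
    return recognize(properties, absent, code + "0")


print(min_tests(10))               # 4
print(recognize({"F1", "F3"}))     # ('N4', '11')
```

With maximally significant tests, as here, four characters need only two stages; a single spurious outcome nevertheless sends the walk down the wrong branch.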

If the patterns are all in binary matrix form, the simplest and most
convenient property to test for is the presence (or absence) of a "1" in a
particular cell of the matrix. If the resolution of the pattern matrices is
sufficient (i.e., if the character field has been subdivided into a sufficient number
of cells), it is always possible to recognize a set of distinct patterns by
testing the series of cells in this manner. The problem of reducing the number
of tests to a minimum is replaced, in such a recognition system, by the
problem of minimizing the number of cells which must be tested in order to
insure identification of the unknown pattern.
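
For very small matrices the minimum set of cells can be found by exhaustive search, as in the sketch below. The search strategy is our own illustrative choice, not a method given in the literature cited here, and the four 2 x 2 patterns are hypothetical.

```python
from itertools import combinations


def minimal_cell_set(patterns):
    """Smallest tuple of cell indices whose contents distinguish all patterns.

    `patterns` maps a label to a flattened binary matrix. The search is
    exhaustive, so it is practical only for small matrices.
    """
    size = len(next(iter(patterns.values())))
    for k in range(size + 1):
        for cells in combinations(range(size), k):
            codes = {tuple(p[c] for c in cells) for p in patterns.values()}
            if len(codes) == len(patterns):
                return cells
    return None   # two patterns are identical; no set of cells suffices


# Hypothetical 2 x 2 patterns, flattened row by row.
PATTERNS = {
    "A": (1, 0, 0, 1),
    "B": (1, 1, 0, 0),
    "C": (0, 1, 0, 1),
    "D": (0, 0, 0, 0),
}

print(minimal_cell_set(PATTERNS))  # (0, 1)
```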

A serious shortcoming of direct logic recognition schemes is that
they are limited by the weakness of their worst tests. In other words, if an
unknown character failed (because of noise or distortion) to have one of the
properties characteristic of its prototype, an error might result, even if all
the other significant properties of the prototype were present. The spuriously
absent property would cause the recognition process to follow the wrong branch
of the character tree, unless complicated error-detecting and error-correcting
loops are included in the system. For this reason, property-list recognition
schemes employing direct logic can only be used when a sufficient number of
simple, reliable tests can be found.


One way to get around this difficulty is to perform the tests in parallel,
rather than sequentially, as in the direct logic system. No decision concerning
the identity of the unknown character is made until the results of all the tests
have been obtained. In the simplest system, the number of properties
possessed jointly by the unknown character and each prototype are tabulated from
the test results, and recognition is based upon the resulting crude similarity
index. This is the simple "majority rule" technique. In more sophisticated
methods, such as that used by Worthie Doyle (6) to recognize sloppy
handwritten capital letters, statistical weighting factors are employed in the
computation of the similarity index. The weighting factors are obtained by combining
assumed a priori probabilities for the characters with the results of a "census",
in which each of the tests is applied to a large representative sample of each
character and the outcomes recorded.
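
The majority rule, and the effect of weighting factors, can be sketched as follows. The prototypes, property names, and the single weight shown are hypothetical; in particular the weight is an arbitrary number chosen for illustration and is not derived from any census.

```python
def similarity_index(observed, prototype, weights=None):
    """Count (or weighted sum) of properties shared by unknown and prototype."""
    shared = observed & prototype
    if weights is None:
        return len(shared)
    return sum(weights.get(prop, 1.0) for prop in shared)


def recognize(observed, prototypes, weights=None):
    """Decide only after all test results are in: highest index wins."""
    return max(prototypes,
               key=lambda name: similarity_index(observed, prototypes[name], weights))


# Hypothetical property lists.
PROTOTYPES = {
    "A": {"p1", "p2", "p3", "p4", "p5"},
    "B": {"p4", "p5", "p6", "p7", "p8"},
}

# An "A" in which p1 is spuriously absent and p6 spuriously present.
observed = {"p2", "p3", "p4", "p5", "p6"}

print(recognize(observed, PROTOTYPES))               # A
print(recognize(observed, PROTOTYPES, {"p6": 5.0}))  # B
```

Two spurious test outcomes do not disturb the unweighted decision here, whereas a poorly chosen weight on an unreliable test reverses it, which is why the weighting factors must reflect the measured behavior of each test.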

If a sufficiently large number of independent tests are used, the
probability of misrecognizing a character due to spurious test outcomes can be made
small. This holds true even if the tests are not very reliable and significant.
For example, Doyle selected a series of tests on the basis of their being
easily programmed and fast running on the general purpose digital computer
he used to test his system. He gave little thought to their reliability or their
power to differentiate between the characters. Nevertheless, he achieved
correct recognition rates of approximately 97% for handwritten capital letters.

One great disadvantage of most property-list recognition schemes
is their relative inflexibility. Once such a system has been built to recognize
a particular set of characters, it usually cannot be used to recognize a
different set of characters without radical alterations. This is particularly true of
direct-logic recognition systems. For applications in which it is desirable to
have a single machine which can be used (with only minor adjustments, if any)
to recognize arbitrary sets of characters, a machine capable of "learning" to
recognize characters would be very valuable. A few such "self-adapting"
systems have been built for experimental purposes. In general, the system
is trained as follows: A large number of representative character samples
are presented to the system. The machine is told the identity of each
character as it is presented, and on the basis of this information "learns" to
recognize the various characters. If the machine is to employ a property-list
recognition scheme, this learning phase must involve a search for significant
properties. As yet no useful machine has been developed which can do this
satisfactorily, although a number of workers have studied the problem*.
The property significance criterion introduced in this report might be used to
advantage in a system of this type.

At present, the most successful self-adaptive systems simply subject
the sample characters to an arbitrary sequence of simple, predetermined
tests. Recognition of unknown characters is based on a statistical analysis
of the test outcomes obtained during the learning phase. Doyle's system,
described above, is of this type, as are those of Baran and Breuer, which
will be discussed in Section 1.5.4.

* See page 93 of O. G. Selfridge, "Pattern Recognition and Modern
Computers". (22)

Another class of self-adapting systems, typified by the Perceptron of
Rosenblatt (21), are based upon assumed models of the brain, with electronic
elements simulating neurons, synapses, etc. The Perceptron model in
simplified form is shown in Fig. 1.18.


FIGURE 1.18. Perceptron Model (Sensory Inputs, A-Units, R-Units)

Sensory inputs are mapped by means of random connections onto a
series of simulated neurons called A-units. Like their biological
counterparts, each of the A-units is quiescent, unless the sum of the "stimuli"
reaching it from the sensory elements exceeds a certain threshold. The
output $x_i$ of each $A_i$, weighted by a variable factor $w_{ij}$, is applied to each
of another set of neurons called R-units (to simplify the diagram, only one
R-unit is shown in Fig. 1.18). The output $O_j$ of $R_j$ is determined as
follows:

$O_j = 1$ if $\sum_i w_{ij} x_i \ge \theta_j$

$O_j = 0$ if $\sum_i w_{ij} x_i < \theta_j$    $(j = 1, 2, \ldots, m)$

where $\theta_j$ = threshold of $R_j$, $m$ = total number of R-units.

During the learning phase, the values of the $w_{ij}$ are changed
whenever the output of $R_j$ does not correspond to some arbitrary desired response
for the given input pattern. If the patterns to be recognized are sufficiently
distinct and sufficiently undistorted, if a sufficient number of sensory and
neural elements are employed, and if a suitable method for adjusting the
values of the $w_{ij}$ is used, the system will converge to a state in which it
will yield the desired responses to input patterns.
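
A single R-unit and one possible adjustment rule can be sketched as follows. The threshold test follows the rule stated above; the update rule, however, is the familiar error-correction procedure and is an assumption on our part, since the text does not specify which rule is used. The A-unit outputs, threshold, and step size are likewise hypothetical.

```python
def r_unit_output(x, weights, threshold):
    """O_j = 1 if the weighted sum of A-unit outputs reaches the threshold."""
    total = sum(w * xi for w, xi in zip(weights, x))
    return 1 if total >= threshold else 0


def train(samples, n_inputs, threshold=0.5, rate=0.25, max_passes=20):
    """Adjust the weights whenever the output differs from the desired one.

    Assumed rule: add rate * (desired - actual) * x_i to each weight.
    """
    weights = [0.0] * n_inputs
    for _ in range(max_passes):
        mistakes = 0
        for x, desired in samples:
            error = desired - r_unit_output(x, weights, threshold)
            if error:
                mistakes += 1
                weights = [w + rate * error * xi for w, xi in zip(weights, x)]
        if mistakes == 0:      # every sample already gives the desired response
            break
    return weights


# Hypothetical A-unit output patterns with the desired R-unit response.
SAMPLES = [
    ([1, 0, 0], 1),
    ([1, 1, 0], 1),
    ([0, 1, 1], 0),
    ([0, 0, 1], 0),
]

weights = train(SAMPLES, n_inputs=3)
print(weights)  # [0.5, 0.25, 0.0]
```

For these separable samples the weights settle after one pass of corrections; patterns that are not sufficiently distinct would exhaust `max_passes` without converging.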

As yet, the optimal method for adjusting the $w_{ij}$ is not known.
However, a number of rules have been tried, mostly with some success.
Verification of network learning has been achieved both by means of digital
computer simulations and by experiments on a hardware model (4). The
latter has successfully learned to recognize the letters of the English
alphabet, in fixed position and font.

1. 5. 3 Automatic Character Recognition in Review

The purpose of this section is to review a few recent developments
in the field of automatic character recognition. Comprehensive reviews of
the literature through approximately August 1960 have been given by Breuer
and by Baran (3, 1). Therefore, only very recent articles are dealt with
here. It is hoped that the systems discussed here, in addition to those
referred to in other sections of this chapter, will provide the reader with
sufficient background to facilitate his understanding of subsequent chapters
of this report.

In the last section it was pointed out that it is highly advantageous to
employ encoding schemes which are invariant with respect to the most
common distortions or variations in the characters to be recognized. An
encoding scheme which has the property of registration invariance (i.e., invariance
with respect to vertical and horizontal translations of the character) is
described by L. P. Horwitz and G. L. Shelton of IBM (14). Suppose the
character is quantized into an $N \times N$ mosaic of black and white squares as
described previously. We establish a Cartesian coordinate system on a
$(2N-1) \times (2N-1)$ field so that each elemental area of the field is associated
with a vector $\mathbf{r} = (x, y)$, as in Fig. 1.19. Assuming that the
$N \times N$ pattern matrix is located somewhere on the $(2N-1) \times (2N-1)$
field, we define a function $f(\mathbf{r})$ as follows:
$f(\mathbf{r}) = f(x, y) = 1$ if the elemental area whose coordinates are
$(x, y)$ is black; otherwise $f(\mathbf{r}) = 0$. Consider the function

$$D(\mathbf{R}) = \sum_{\mathbf{r}} f(\mathbf{r})\, f(\mathbf{r} - \mathbf{R}),$$

whose domain is the $(2N-1) \times (2N-1)$ field. Evidently, $D(\mathbf{R})$
equals the number of pairs of ones on the pattern matrix with relative
separation $\mathbf{R}$; this number is obviously invariant with respect to
shifts of the $N \times N$ pattern matrix on the $(2N-1) \times (2N-1)$ field,
since the relative separation of a pair of ones is registration invariant.
$D(\mathbf{R})$ is known as the discrete autocorrelation function of the
pattern from which it was derived. It is necessary to extend $f(\mathbf{r})$
from the $N \times N$ domain to the $(2N-1) \times (2N-1)$ matrix since the
vector $(\mathbf{r} - \mathbf{R})$ does not necessarily lie on the original
$N \times N$ matrix.
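
The registration invariance of the discrete autocorrelation function can be checked directly. In this sketch the function is stored as a table keyed by the separation, which avoids any explicit extension of the field; the 2 x 2 pattern placed on a 3 x 3 field (N = 2) is hypothetical.

```python
def autocorrelation(field):
    """Discrete autocorrelation of a binary field.

    Returns a dict mapping each separation R = (dx, dy) to the number of
    ordered pairs of ones (r, r - R); separations that never occur are
    simply absent, i.e. zero.
    """
    ones = [(x, y)
            for y, row in enumerate(field)
            for x, value in enumerate(row) if value]
    counts = {}
    for x1, y1 in ones:
        for x2, y2 in ones:
            separation = (x1 - x2, y1 - y2)
            counts[separation] = counts.get(separation, 0) + 1
    return counts


# The same 2 x 2 pattern at two positions on a 3 x 3 field.
upper_left = [[1, 1, 0],
              [1, 0, 0],
              [0, 0, 0]]
lower_right = [[0, 0, 0],
               [0, 1, 1],
               [0, 1, 0]]

d = autocorrelation(upper_left)
print(d[(0, 0)])                             # 3, the number of ones
print(d == autocorrelation(lower_right))     # True
```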

The discrete autocorrelation function is not in general a 1-to-1
function of the set of patterns to be recognized; indeed, for any particular pattern
$P$, $D_P(\mathbf{R}) = D_{P'}(\mathbf{R})$, where $P'$ is a new pattern formed by
rotating $P$ through 180°. Hence, $D(\mathbf{R})$ does not always form an
adequate description of a set of patterns for recognition purposes; for example,
one could not distinguish a "6" from a "9" given only the discrete
autocorrelation function. However, given a set of prototype patterns whose
discrete autocorrelation functions are distinct, it is quite practical to base
a recognition scheme upon the autocorrelation function, even in the presence of
considerable noise and distortion. Horwitz and Shelton took as a measure of the
"similarity" between two patterns $A$ and $B$ the quantity

$$S_{AB} = \frac{\sum_{\mathbf{R}} D_A(\mathbf{R})\, D_B(\mathbf{R})}{\sqrt{\sum_{\mathbf{R}} D_A^2(\mathbf{R})\; \sum_{\mathbf{R}} D_B^2(\mathbf{R})}},$$

which represents the cosine of the angle between two multidimensional vectors.
By Schwarz's Inequality, $S_{AB} \le 1$, with $S_{AB} = 1$ only if $A$ and $B$
are identical except for rotation through 180° or shifting. If the
autocorrelation functions $D_1(\mathbf{R}), D_2(\mathbf{R}), \ldots, D_m(\mathbf{R})$
of the prototype patterns $P_1, P_2, \ldots, P_m$ are known, then an unknown
pattern, $U$, can be identified by computing its "similarity"
$S_{U1}, S_{U2}, \ldots, S_{Um}$ to each of the possible prototypes and
associating it with the prototype to which it is most "similar". To prove the
feasibility of the recognition scheme, a small number of alphabetic characters
were digitally processed by Horwitz and Shelton with encouraging results.
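
The similarity measure and the 180° ambiguity can be illustrated as follows. This is a sketch only; the 3 x 3 patterns are hypothetical, and the autocorrelation is again stored as a table keyed by separation.

```python
import math


def autocorrelation(field):
    """Map each separation (dx, dy) to the number of ordered pairs of ones."""
    ones = [(x, y)
            for y, row in enumerate(field)
            for x, value in enumerate(row) if value]
    counts = {}
    for x1, y1 in ones:
        for x2, y2 in ones:
            key = (x1 - x2, y1 - y2)
            counts[key] = counts.get(key, 0) + 1
    return counts


def similarity(d_a, d_b):
    """Cosine of the angle between two autocorrelation functions."""
    dot = sum(value * d_b.get(key, 0) for key, value in d_a.items())
    norm_a = math.sqrt(sum(v * v for v in d_a.values()))
    norm_b = math.sqrt(sum(v * v for v in d_b.values()))
    return dot / (norm_a * norm_b)


def rotate_180(field):
    """Rotate a pattern through 180 degrees."""
    return [row[::-1] for row in field[::-1]]


P = [[1, 1, 0],
     [1, 0, 0],
     [1, 0, 0]]
Q = [[1, 1, 1],
     [0, 0, 0],
     [0, 0, 0]]

d_p = autocorrelation(P)
s_rotated = similarity(d_p, autocorrelation(rotate_180(P)))
s_other = similarity(d_p, autocorrelation(Q))
print(round(s_rotated, 6), round(s_other, 3))
```

The rotated copy is indistinguishable from the original (similarity 1 to within rounding), while the unrelated pattern scores well below 1.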

Work is now being done on the problem of constructing a practical
registration-invariant reading machine based on the autocorrelation function.
It can be shown that in the limit, as the resolution of the pattern matrix is
increased, the quantity $S_{AB}$ defined above can be found by means of specially
designed optical filters. The character $A$ to be recognized must be in the
form of a negative photographic transparency. The Fraunhofer diffraction
pattern produced by this negative is projected by a collimating lens through a
normalized optical filter made by photographing the Fraunhofer diffraction
pattern of the prototype character $B$. The total light energy passing through
the system is then proportional to $S_{AB}$. In practice the character could be
recognized by the arrangement shown in Fig. 1.20, in which the unknown
character is compared in parallel with each of the prototypes.

At present the optical method is not practical, since the document
must be in the form of a transparency. Also, the light levels inherent to the
system are very low. An electronic system for generating autocorrelation
functions has therefore been developed by Shelton, McDermid and Petersen.

FIGURE 1.20. Parallel recognition through Fraunhofer diffraction


The unknown character, in $N \times (2N-1)$ binary matrix form, is entered, bit
by bit, into a shift register as shown in Fig. 1.21. The order in which the bits
enter the register is shown by the numbering in Fig. 1.22. The first
position in the register is ANDed independently with all the other register
positions, and the outputs of the ANDs are each fed into a counter. After the
matrix has completely entered the shift register, the numbers in the counters
will be (reading from left to right):

(1) The total number of bits in the pattern.

(2) Number of 1's separated by 1 shift in +Y direction.

(3) Number of 1's separated by 2 shifts in +Y direction.

(N) Number of 1's separated by 1-unit shift in X-direction and N-1 shift
in -Y direction.

(N+1) Number of 1's separated by 1-unit shift in X-direction and N-2 shift
in -Y direction.

and so forth, every 2Nth position corresponding to an individual
X-shift. The similarity function defined above can be computed by
multiplying the numbers in the counters by the appropriate factors for each
comparison channel, summing, and normalizing as shown in the block diagram
of Fig. 1.21.

In this system, it is possible to distinguish between two characters
which differ only by a rotation through 180° by looking at the partial
autocorrelation functions which exist in the counters as the input enters the register.
In general, at corresponding times, these will be distinct, even for this
ambiguous case.


FIGURE 1.21. Omitted units are indicated by dotted lines.

A line-drawing pattern recognizer developed by L. D. Harmon of
Bell Telephone Laboratories (13) makes use of a scanning system yielding
character descriptions which are essentially invariant with respect to
rotation and changes in size. At present the system is capable of recognizing
line drawings of circles, triangles, squares, pentagons, and hexagons. It
can also count up to six small opaque objects. With more sophisticated
circuitry, it appears possible to extend the system to recognize hand-written
or printed alphanumeric characters as well.

The scanning field is an arrangement of c concentric rings of r
elements each (Fig. 1.23). Scanning consists of sequentially observing the
elements in each ring, starting from the innermost ring and proceeding
outward. Thus any pattern in the scanning field is transformed into a
sequence (in time) of c binary numbers, each r digits in length. If these
numbers are arranged in the form of a matrix as shown in Fig. 1.24, it is
evident that changes in size of the scanned figure will result merely in
vertical shifts of the pattern of ones on the matrix, as long as the figure remains
on the scanning field. Similarly, rotations of the figure being scanned will
result in horizontal shifts of the pattern of ones. In case a portion of the
pattern is shifted off the matrix, due to a rotation of the figure on the
scanning field, it will re-appear on the opposite side. Hence this type of scanning
yields descriptions which are indeed quite invariant with respect to rotation
and size changes.
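
The effect of rotation on the scan matrix can be sketched as follows. The sketch assumes an idealized field of c = 3 rings and r = 8 elements and rotations by whole element positions; the figure being scanned is hypothetical, and the treatment of size changes is not modeled.

```python
def scan(figure, c, r):
    """Scan ring by ring, innermost first: c rows of r binary digits."""
    return [[figure(ring, element) for element in range(r)]
            for ring in range(c)]


def rotated(figure, steps, r):
    """The same figure turned by `steps` element positions around the center."""
    def turned(ring, element):
        return figure(ring, (element - steps) % r)
    return turned


def cyclic_shift(matrix, steps):
    """Shift every row to the right; digits pushed off re-enter on the left."""
    shifted = []
    for row in matrix:
        k = steps % len(row)
        shifted.append(row[-k:] + row[:-k] if k else list(row))
    return shifted


def figure(ring, element):
    """Hypothetical figure: a radial stroke plus one isolated inner dot."""
    if element == 1 and ring >= 1:
        return 1
    if ring == 0 and element == 3:
        return 1
    return 0


C, R = 3, 8
original = scan(figure, C, R)
turned_by_2 = scan(rotated(figure, 2, R), C, R)
turned_by_7 = scan(rotated(figure, 7, R), C, R)

print(turned_by_2 == cyclic_shift(original, 2))  # True
print(turned_by_7 == cyclic_shift(original, 7))  # True, with wrap-around
```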


FIGURE 1.23

FIGURE 1.24. Matrix obtained from scanning field by cutting along
dotted line in Fig. 1.23 and straightening.

FIGURE 1.25

The results of scanning a few centered numerals are shown in Fig.
1.25. It will be observed that there exist certain properties which distinguish
the patterns corresponding to various forms of the figure "2" from those due
to scanning of a "5". The two-valued function over the interval indicated in
Fig. 1.25 establishes the presence of the top bar and top of the curved portion
of a 5, for example. This feature would be absent from a 2, which instead would
probably have an ascending function over the corresponding interval as shown,
etc. If tilting is present, relative rather than absolute values of r must be
considered, of course, but this would not be a difficult task for a reading
machine to perform (in fact, the registration-invariant autocorrelation
function of Horwitz and Shelton might be employed at this point). In principle,
at least, it would seem that the scanning system considered here could form the
basis of a useful direct or majority-logic recognizer for characters subject to
tilting and variations in size.

The simple polygon-recognizer and object counter actually built by
Harmon and his associates employs a mechanically puckered ring of 32
photocells as a scanning mechanism. Whenever a particular photocell
encounters a black area, this fact is recorded in associated circuitry. If
we associate the value 0 with all photocells which have not encountered a
black area, and the value 1 with those which have, it is evident that counting
a number of objects having sufficient angular separation is equivalent to
counting a series of strings of 0's and 1's (Fig. 1.26). Recognition of
polygons is based on the observation that as the scanning ring intersects a vertex,
two strings of 1's must flow together (Fig. 1.27). By counting the number
of times this occurs during the scanning cycle, a convex polygon or circle
can be identified. All counting operations mentioned here are performed by
an arrangement of thyratrons, neon bulbs, and photocells.
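
Counting well-separated objects then amounts to counting the strings of 1's around the ring, remembering that the ring closes on itself. The following sketch uses a hypothetical ring of 12 cells rather than 32 and does not model the merging of strings at a vertex.

```python
def count_strings_of_ones(ring):
    """Number of maximal strings of 1's in a circular sequence of cell states."""
    if not ring:
        return 0
    if all(ring):
        return 1      # one unbroken string all the way around
    # A string begins wherever a 1 follows a 0; index -1 wraps to the last cell.
    return sum(1 for i in range(len(ring))
               if ring[i] == 1 and ring[i - 1] == 0)


# Hypothetical photocell states: three objects, one straddling the ends
# of the list (cells 10, 11 and 0 form a single string).
ring = [1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1]
print(count_strings_of_ones(ring))  # 3
```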

A pattern-recognition system bearing certain similarities to Harmon's
has been designed by J. R. Singer of the University of California at Berkeley
(23). Singer's scheme employs an interesting image-sensing unit which is
based upon a model of the human retina and optic nerve. This sensory unit
is composed of a group of basic contour-sensing units, one of which is shown
in Fig. 1.28. It will be observed that an output will be produced unless all
four of the photocells a, b, c, and d are equally illuminated. In other words,
there will be an output from the logical unit if there is any variation in light
intensity on the field scanned by the unit. By similarly differentiating the
outputs of four of these basic logic units, higher-order outputs are obtained;
by repeating this process, extremely complex and discriminating sensory
systems can be produced. The output lines of the differentiating circuits at
various levels are termed "signal conductors" or "optic fibers". Singer has
adopted a sensory system consisting of 72 photosensitive elements and 55
optic fibers arranged in a circular configuration (Fig. 1.29).

FIGURE 1.26

The optic fibers are connected to a device called a "delay transformer"
which transforms the image from "optic fiber space" into a "delay space".
Essentially, "delay space" is a space in which the image retains its form,
but expands in size at a rate which is a function of the time. A simple delay
transformer might be constructed by connecting each optic fiber through a
delay line or through a series of delay elements to an optic fiber on the same
azimuth, but located at a larger distance from the center (Fig. 1.30). As
the various optic fiber pulses pass through the periphery, the former purely
spatial relationships of the figure contour are converted into temporo-spatial
relationships. As in Harmon's expanding circular scanning system, the
figure is transformed into a space in which size variations appear as time
delays, and the effects of rotation are trivial. In fact, by properly storing
the delay-space outputs, a binary matrix representation of the input pattern
can be obtained which is identical with the binary matrices obtained by
Harmon.

Singer has suggested a self-organizing recognition scheme for use
with the above input system. During a learning phase, a large number of
image code matrices are stored in a ferrite memory plane. Before each new
image code is stored, it is tested to make sure that it is sufficiently different
from the image codes already stored. In this way, excessive redundancy in
the memory is avoided. It should be observed that two forms of the same
character (such as "A" and "a") would be stored as separate images in this
system. A control unit later re-examines the memory for such redundancies
in order to reduce them by logical analysis.

FIGURE 1.29
Solid black circles represent optic fibers.
Small white circles represent photosensitive elements.

S = delay time.
Solid black circles represent optic fibers.
Open circles represent delay space output fibers.

Recognition proceeds as follows: The complement of the code matrix
of the image to be recognized is taken. This complement is then added in
turn to each of the stored image matrices having the same order as the
complement matrix. If the total sum of all the digits in the sum matrix is
not within the range mn ± e, where m is the number of rows, n is the
number of columns, and e is a "noise number" which is optimized by the machine,
then the symbol is not "recognized". In case the input matrix is not
recognized as any of the stored matrices, it is stored as a new prototype matrix.

The optimal value of e depends both on the stored images and on the
nature of the images to be recognized. If e is too large, misrecognition or
multiple recognition will be common; if e is too small, recognition will not
be achieved, and the memory will be loaded down with redundant images. A
program for optimizing the value of e on the basis of experience gained by
processing a large number of samples is incorporated in the machine.
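
The acceptance test described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the dictionary of prototypes, and the reading of "added" as ordinary cell-by-cell arithmetic addition are all choices made here, not details given in the text.

```python
def complement_sum(unknown, stored):
    """Total of all digits in (complement of unknown) + stored, cell by cell."""
    return sum((1 - u) + s
               for u_row, s_row in zip(unknown, stored)
               for u, s in zip(u_row, s_row))


def recognize(unknown, prototypes, e):
    """Return labels of stored matrices whose sum total lies within mn +/- e."""
    m, n = len(unknown), len(unknown[0])
    matches = []
    for label, stored in prototypes.items():
        # Only matrices of the same order as the unknown are compared.
        if len(stored) != m or len(stored[0]) != n:
            continue
        if abs(complement_sum(unknown, stored) - m * n) <= e:
            matches.append(label)
    return matches


# Hypothetical 3 x 3 prototypes, for illustration only.
PROTOTYPES = {
    "T": [[1, 1, 1], [0, 1, 0], [0, 1, 0]],
    "I": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
}
```

An empty result corresponds to the "not recognized" case, in which the input would be stored as a new prototype. Under this literal reading the total equals mn minus the number of black cells in the unknown plus the number of black cells in the stored matrix, so the test is only as selective as that difference allows.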
This report is a continuation of research work done by Baran, Breuer,
and Manelis at the University of California at Los Angeles Digital Technology
Laboratory (1,3,18).

Since  computer  routines  and  techniques  developed  by  these  workers 
have  been  used  in  this  report,  a  brief  review  and  explanation  of  their  work 
is  essential  and  is  provided  in  this  section. 

Baran's Work:

The purpose of Baran's thesis was the development and demonstration
of a pattern recognizing technique which would be suitable for use with
language translation machines. Baran's technique was developed with the
following observations in mind:

(1) To be useful, the technique would have to be capable of recognizing
punctuation, as well as alphabetic and numerical characters of
varying styles and fonts.

(2) A reasonable amount of noise in the form of deformed characters,
misalignments, smears, etc., must be allowable without undue
deterioration of reading ability.

(3) The technique would have to involve a learning process in which the
reading machine "adapts" itself to the range of type fonts and printing
distortions characteristic of the material to be translated. Otherwise
the reading machine could only be used to read one particular type
font, whereas often one encounters several different type fonts (e.g.
roman, italic, bold-face) on one page of a journal or document.

(4) Accuracy need not be maintained at the level necessary in, for
example, a check-reading machine where a single misread digit might be
disastrous. The high redundancy of written languages would make
error detection simple, particularly if some indication of the
likelihood of a reading error were provided by the reading machine.

The theoretical basis of Baran's recognition scheme is as follows:

Let us assume that we are dealing with characters which have been expressed
in the form of binary matrices, by superimposing them on a rectangular grid
containing I x J cells (Fig. 1.13). Let us further assume that we desire
to recognize a set of K characters $N_1, N_2, \ldots, N_K$ whose a priori
probabilities of occurrence $\{p(N_k)\}$ are known. The recognition scheme consists
of two phases. In the preliminary or "learning" phase, we feed t
representative samples of each of the K characters into the reading machine (a digital
computer). The machine computes the conditional probabilities

$$p(b_{ij} \mid N_k) = \frac{1}{t} \sum_{s=1}^{t} x_{ij}^{(s)}(N_k), \qquad p(w_{ij} \mid N_k) = 1 - p(b_{ij} \mid N_k),$$

where

$w_{ij}$ = the event "cell $c_{ij}$ is white",

$b_{ij}$ = the event "cell $c_{ij}$ is black",

and

$$x_{ij}^{(s)}(N_k) = \begin{cases} 1 & \text{if cell } c_{ij} \text{ of the } s\text{-th sample of } N_k \text{ is black,} \\ 0 & \text{if cell } c_{ij} \text{ of the } s\text{-th sample of } N_k \text{ is white.} \end{cases}$$

The learning phase is completed by inputting the known a priori
probabilities $p(N_1), p(N_2), \ldots, p(N_K)$.

In the recognition phase, an unknown character $N_x$ is fed into the
machine. Recognition is accomplished by computing the a posteriori
probabilities (based upon the information contained in the cells of $N_x$) of the K
hypotheses:

$$H_1: N_x = N_1, \quad H_2: N_x = N_2, \quad \ldots, \quad H_K: N_x = N_K .$$

The process is completed by assuming that the hypothesis $H_k$ with
the highest a posteriori probability is the correct one. The mathematical
basis of the digital computation is now given.

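Suppose that the first cell $c_{11}$ of $N_x$ is black. On the basis of this
cell only, we can use fundamental rules of probability theory to compute the
a posteriori probabilities of the hypotheses $H_1, H_2, \ldots, H_K$. We have:

$$P(N_k \mid b_{11}) = \frac{p(N_k)\, p(b_{11} \mid N_k)}{\sum_{l=1}^{K} p(N_l)\, p(b_{11} \mid N_l)} .$$

Similarly, if cell $c_{11}$ of $N_x$ is white, we have

$$P(N_k \mid w_{11}) = \frac{p(N_k)\, p(w_{11} \mid N_k)}{\sum_{l=1}^{K} p(N_l)\, p(w_{11} \mid N_l)} .$$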
To simplify matters, let

$$\lambda_{w_{ij}} = \frac{1}{\sum_{l=1}^{K} p(N_l)\, p(w_{ij} \mid N_l)}, \qquad \lambda_{b_{ij}} = \frac{1}{\sum_{l=1}^{K} p(N_l)\, p(b_{ij} \mid N_l)} .$$

The quantities $\lambda_{w_{ij}}$ and $\lambda_{b_{ij}}$ are called respectively the "white cell weight
factor" and the "black cell weight factor" of $c_{ij}$, since they are related to
the "weight" or "importance" of the cell $c_{ij}$. More will be said about the
significance of the cell weight factor later.

It is also convenient to introduce a variable $v_{ij}$ such that

$$v_{ij} = \begin{cases} b_{ij} & \text{if cell } c_{ij} \text{ of } N_x \text{ is black,} \\ w_{ij} & \text{if cell } c_{ij} \text{ of } N_x \text{ is white.} \end{cases}$$

We can then write the single formula

$$P_1(N_k) = P(N_k \mid v_{11}) = \lambda_{v_{11}}\, p(v_{11} \mid N_k)\, p(N_k) .$$

If we make the simplifying approximation that the cells are statistically
independent, then we can compute the second-order a posteriori probability
$P_2(N_k)$, which is based upon the information contained in the first two cells,
by a similar analysis. However, we now use the first-order a posteriori
probability $P_1(N_k)$ instead of the a priori probability $p(N_k)$. Hence,

$$P_2(N_k) = \lambda_{v_{12}}\, p(v_{12} \mid N_k)\, P_1(N_k) .$$

Continuing in this manner, it is evident that the final a posteriori probability
of $H_k$ based upon the entire matrix is given by

$$P(N_k) = p(N_k) \prod_{i=1}^{I} \prod_{j=1}^{J} \lambda_{v_{ij}}\, p(v_{ij} \mid N_k) .$$

As stated previously, the prime choice for the identity of $N_x$ is the character
$N_g$ having the maximum value of $P(N_k)$. It will be observed that in addition to
an educated guess $N_g$ as to the identity of $N_x$, the recognition scheme yields
an estimate of the confidence of the guess in the form of $P(N_g)$.
Furthermore, by examining the values of $P(N_k)$ for $k \neq g$, we can find the most
likely alternatives to $N_g$ in case there is reason to suspect this estimate of
being incorrect. These properties are very useful in language translation
reading schemes, as was noted earlier.
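
The two phases can be sketched as follows. This is a minimal illustration, not Baran's program: the function names and data layout are assumptions, the conditional probabilities are taken to be simple relative frequencies, and the scores are renormalized after every cell so that they sum to one, which plays the part of the cell weight factor in the derivation above.

```python
def learn(samples_by_char):
    """Estimate p(black | character) for every cell as a relative frequency.

    samples_by_char maps a character label to a list of binary matrices
    (lists of rows, 1 = black, 0 = white), all of the same size.
    """
    probs = {}
    for label, samples in samples_by_char.items():
        t = len(samples)
        rows, cols = len(samples[0]), len(samples[0][0])
        probs[label] = [[sum(s[i][j] for s in samples) / t for j in range(cols)]
                        for i in range(rows)]
    return probs


def posteriors(unknown, probs, priors):
    """Update the a priori probabilities one cell at a time."""
    scores = dict(priors)
    for i, row in enumerate(unknown):
        for j, cell in enumerate(row):
            for label in scores:
                p_black = probs[label][i][j]
                scores[label] *= p_black if cell else 1.0 - p_black
            total = sum(scores.values())
            if total == 0.0:
                raise ValueError("unknown is impossible under every hypothesis")
            # Renormalizing keeps the running values interpretable as
            # first-order, second-order, ... a posteriori probabilities.
            scores = {label: value / total for label, value in scores.items()}
    return scores
```

The label with the largest returned value is the prime choice, its value serves as the confidence estimate, and the remaining values rank the alternatives.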

As indicated in the derivation, the Baran scheme is based upon the
obviously inaccurate assumption of statistical independence of the cells.
However, for reasonably undistorted unknowns, this is not likely to cause
errors in recognition. To demonstrate this, Baran programmed the IBM 700
computer to perform the computations indicated above. As input characters,
he utilized 96 samples of each of the ten numerals. These samples were in
the form of binary digit matrices which had been punched on cards by IBM
for computing purposes. Varying degrees of distortion were represented,
ranging from near-perfect samples to severely under-inked ones. A few
samples of these numerals are shown in Fig. 1.31. Using half of the
samples in the learning phase, and the other half as unknowns, Baran was able
to achieve a correct recognition rate of 81%. This is certainly sufficient for
the purposes of a language translation reader.

Baran considered the possibilities of producing special purpose reading
machines using optical filters based upon his recognition scheme. Readers
interested in this, or in the details of the computer program, the character
samples, etc., are referred to his thesis.

Breuer's Work:

Breuer's thesis covered a broad range of topics, including an
alternate (masking) recognition scheme, computer routines to aid recognition by
filtering and centering input characters, and a "binary division" routine for
finding the most significant areas of the character field from the standpoint
of differentiating the characters. This binary division routine, in addition
to other work done by Breuer on the significant area problem, is discussed
elsewhere in this chapter.

Breuer's recognition scheme is actually a simplified form of Baran's
scheme. As in Baran's scheme, there is a preliminary learning phase in
which representative samples of each of the characters to be recognized,
expressed in the form of binary matrices, and properly identified, are fed
into the machine. The conditional probabilities $p(b_{ij} \mid N_k)$ and $p(w_{ij} \mid N_k)$ are
estimated by the machine as discussed previously. Recognition of an unknown
character is based upon the simple formula

$$P(H_k) = \prod_{i=1}^{I} \prod_{j=1}^{J} p(v_{ij} \mid N_k),$$

where

$$v_{ij} = \begin{cases} b_{ij} & \text{if cell } c_{ij} \text{ of } N_x \text{ is black,} \\ w_{ij} & \text{if cell } c_{ij} \text{ of } N_x \text{ is white.} \end{cases}$$

The unknown character $N_x$ is identified as $N_g$ such that

$$P(H_g) = \max_{k} P(H_k),$$

with $H_k$ as defined previously.




It is evident that the above formula is actually an expression for
$P(V \mid N_k)$, where $V = (v_{11}, v_{12}, \ldots, v_{IJ})$ is a vector representing
the observed state of the cells of $N_x$. (Again the approximation of
independence of the cells is made.) In general, one would not expect Breuer's
scheme to be as accurate as Baran's scheme, which yields values for the
more pertinent quantities $P(N_k \mid V)$. Inasmuch as Breuer's scheme makes no
provision for variations in the a priori probabilities of the individual patterns,
it may break down if the patterns are not equi-probable. However, as Breuer
showed experimentally, his recognition scheme is about as accurate as Baran's
for equi-probable patterns in which the distortion rates are not excessive.

Breuer programmed his recognition scheme in the form of a masking
technique by taking the logarithms of the quantities $p(b_{ij} \mid N_k)$ and $p(w_{ij} \mid N_k)$.
Recognition is based on the formula

$$\log P(H_k) = \sum_{i=1}^{I} \sum_{j=1}^{J} \log p(v_{ij} \mid N_k) .$$

Since the logarithm is a monotonically increasing function,
recognition can be based on the value of $\log P(H_k)$ just as well as on that of $P(H_k)$.
Provision must be made for the cases $p(b_{ij} \mid N_k) = 0$ and $p(w_{ij} \mid N_k) = 0$, since the
logarithm of 0 is not defined. This was done by adding a small positive
amount of "noise" $\epsilon$ to all of the estimated conditional probabilities $p(b_{ij} \mid N_k)$ and
$p(w_{ij} \mid N_k)$ before taking logarithms.
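
A short sketch of the logarithmic masking idea follows. The names `log_masks`, `log_score`, and `identify`, and the default value of the noise term, are assumptions made for illustration; the source does not give the value of the noise actually used.

```python
import math


def log_masks(p_black, eps=1e-3):
    """Build one 'black' and one 'white' mask of log-probabilities.

    eps is the small positive noise added so that log(0) never occurs.
    """
    black_mask = [[math.log(p + eps) for p in row] for row in p_black]
    white_mask = [[math.log(1.0 - p + eps) for p in row] for row in p_black]
    return black_mask, white_mask


def log_score(unknown, masks):
    """Sum the mask entries selected by the cells of the unknown character."""
    black_mask, white_mask = masks
    return sum(black_mask[i][j] if cell else white_mask[i][j]
               for i, row in enumerate(unknown)
               for j, cell in enumerate(row))


def identify(unknown, p_black_by_char, eps=1e-3):
    """Return the label whose masks give the largest log score."""
    return max(p_black_by_char,
               key=lambda k: log_score(unknown, log_masks(p_black_by_char[k], eps)))
```

Because only sums of precomputed mask entries are needed at recognition time, the products of the original formula are avoided entirely.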

Breuer simulated several filtering schemes designed to improve
recognition by removing random ink blotches, filling in missing portions of
the character, and smoothing the edges. He also programmed a centroidal
translation routine which shifts an input character so that its centroid
coincides with the center of the scanning field. The usefulness of these routines
was established by Breuer in an experiment in which a reduction in error
rate of about 30% was achieved through their use.
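
The centroidal translation can be illustrated with the sketch below. It is an assumption-laden example rather than Breuer's routine: the shift is rounded to whole cells, black cells pushed past the edge of the field are dropped, and vacated cells become white.

```python
def centroid_shift(matrix):
    """Shift a binary character so its centroid lands on the field center."""
    rows, cols = len(matrix), len(matrix[0])
    black = [(i, j) for i in range(rows) for j in range(cols) if matrix[i][j]]
    if not black:
        return [row[:] for row in matrix]  # nothing to center
    ci = sum(i for i, _ in black) / len(black)
    cj = sum(j for _, j in black) / len(black)
    # Whole-cell displacement from the centroid to the center of the field.
    di = round((rows - 1) / 2 - ci)
    dj = round((cols - 1) / 2 - cj)
    out = [[0] * cols for _ in range(rows)]
    for i, j in black:
        ni, nj = i + di, j + dj
        if 0 <= ni < rows and 0 <= nj < cols:
            out[ni][nj] = 1
    return out
```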

Manelis' Work:

The object of Manelis' thesis was to develop computer routines for
the generation of character samples having controlled or random amounts of
distortion, starting with a set of ideal characters. The characters so
generated are useful inputs for testing character recognition schemes. In fact,
the impetus for development of the character generation routines arose from
the inadequacy of the sample numerals used by Baran and Breuer for
conclusive testing of their character recognition techniques.

Specifically, the samples used by Baran and Breuer had the following
shortcomings: First, the number of samples used was necessarily limited
to the number available on punched cards (96 samples of each numeral). As
these samples alone required 28,800 cards, it is obvious that the number of
samples was not readily extendable. For statistical reasons, however, it
may be desirable to have a much larger number of samples available for both
the learning and recognition phases when testing recognition rates. Secondly,
the available samples represented only one of many possible types of
deviation from the ideal, viz. that due to a more or less worn out typewriter
ribbon. Hence, if one desired to test a recognition system designed, say, to
operate in a milieu of over-inked rather than under-inked characters, the
numeral samples of Baran and Breuer would obviously not be ideal guinea
pigs. Finally, the only means Baran and Breuer had of controlling the
amount of distortion prevalent in their samples was through increasing or
decreasing the number of samples used. Increasing the number of samples
increased the average amount of distortion, and vice versa, since the
samples became progressively more deteriorated as the sample numbers
increased. However, this provided only a rough, qualitative control over
the amount of distortion present. Some sort of quantitative measure of the
distortion rate, on the other hand, would be very useful as it would permit the
experimenter to make objective statements about the ability of his
recognition scheme to read distorted characters successfully. By measuring the
amount of distortion present in a typical document, the experimenter would
be able to predict whether or not his reading machine was accurate enough
for the intended purpose.*

All of the above shortcomings can be obviated by use of character
generation techniques of the type developed by Manelis. Manelis started
with a set of 33 ideal Cyrillic (Russian) characters which he quantized and
punched on IBM cards in the same manner as the numeral samples used by
Baran and Breuer. Ten of the quantized ideal characters are shown
in Fig. 1.32. Manelis developed several types of character distortion
routines. Two of these are controlled displacement routines whose purpose is
to shift the ideally centered character matrix a specified number of columns
to the right or to the left; two others shift the matrix vertically by a fixed
amount at the discretion of the experimenter. The results of applying these
programs to ideal characters are shown in Fig. 1.33. Manelis also
developed a random shifting routine whose purpose is to shift the input character
by a random amount horizontally and vertically in succession. The amount
of shifting in this routine is determined by a sub-program which generates
random numbers. These shifting routines simulate the effects of systematic
and random errors which might occur in the character-centering mechanism
of a reading machine.

* In case the reading machine incorporates a filter and/or smoother,
the experimenter would, of course, measure the distortion rate of
the output of the filter and/or smoother when the input characters
are taken from the document in question.

Another routine by Manelis generates "random unbiased distortion".
This involves changing the color of a specified percentage of the cells in the
character matrix. A random number generating program determines which
cells are to be affected. The results of this routine are demonstrated in
Fig. 1.34 for eighteen percent distortion. In practice, this type of
distortion might arise as a result of such causes as paper roughness,
electrical noise in a photocell scanning mosaic, or typing through carbon paper.
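
A sketch of such a routine is given below. The function name and the use of Python's `random.Random.sample` to choose distinct cells are assumptions for illustration; Manelis' own random number sub-program is not described in the text.

```python
import random


def unbiased_distortion(matrix, percent, rng=None):
    """Flip the color of the given percentage of cells, chosen at random."""
    rng = rng or random.Random()
    rows, cols = len(matrix), len(matrix[0])
    cells = [(i, j) for i in range(rows) for j in range(cols)]
    n_flip = round(len(cells) * percent / 100)
    out = [row[:] for row in matrix]
    # sample() picks distinct cells, so exactly n_flip cells change color.
    for i, j in rng.sample(cells, n_flip):
        out[i][j] = 1 - out[i][j]
    return out
```

Passing a seeded generator, for example `random.Random(1)`, makes a distorted sample reproducible from run to run.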


A final type of distortion simulated by Manelis is known as random
biased distortion (Fig. 1.35). A specified number of entire rows and/or
columns are deleted (i.e., set equal to zero). A random number first
determines whether a row or a column is to be deleted. A second random number
determines which row (or column) is to be deleted. Presumably, distortions
of this type arise in typewriting when the ribbon has been heavily used.
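
The two-stage random choice can be sketched as follows. The function name, the equal odds given to rows and columns, and the fact that the same line may be drawn twice are assumptions of this illustration, not details taken from Manelis' routine.

```python
import random


def biased_distortion(matrix, n_lines, rng=None):
    """Delete (set to zero) n_lines randomly chosen rows and/or columns."""
    rng = rng or random.Random()
    out = [row[:] for row in matrix]
    rows, cols = len(out), len(out[0])
    for _ in range(n_lines):
        # First random number: row or column.
        if rng.random() < 0.5:
            # Second random number: which row.
            out[rng.randrange(rows)] = [0] * cols
        else:
            # Second random number: which column.
            j = rng.randrange(cols)
            for row in out:
                row[j] = 0
    return out
```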

By  combining  the  above  types  of  distortion  and  by  varying  the  degrees 
of  the  individual  distortions,  many  of  the  noise  situations  apt  to  be  encountered 
in  practice  can  be  adequately  simulated. 

1.5.5  Review of Recent Literature Pertaining to Property Significance

In  this  section,  work  done  by  previous  authors  relating  to  the  prob¬ 
lem  of  finding  significant  pattern-discriminating  properties  will  be  reviewed. 

In the literature, the problem has so far been approached from the
standpoint of finding minimal relative code representations for the unknown
characters. In other words, given a set of prototype characters, the
question is asked: "Taking into account possible variations due to noise,
by what means can the characters be coded so that (1) the codes for different
characters will almost always be distinct, and (2) the codes will be as short
as possible?" As mentioned previously, hardware and recognition time can
often be reduced substantially by solving this problem.

The minimal representation problem is closely related to the
problem of determining the significance of properties or areas of the pattern.
This can be seen as follows: each digit in the code representation of a
character indicates the presence of a specific property in the character. It
follows, therefore, that the less significant the properties corresponding to
the code digits are, the less information each code digit will contain
concerning the identity of the character represented, and the greater the code
length will have to be in order for the characters to be uniquely represented.
Moreover, given a non-minimal coding scheme for a group of patterns, the
problem of minimizing the code by eliminating redundant or irrelevant digits
is equivalent to the problem of reducing a set of properties by finding and
eliminating the redundant and irrelevant (i.e., "non-significant") ones.

It  has  been  pointed  out  previously  that  no  generality  is  lost  by 
restricting  the  discussion  to  the  determination  of  significant  areas  of  the 
patterns, rather than significant properties. For simplicity, we shall make
this restriction in the remainder of this report.

Arthur Glovazky (11) attacked the problem of determining minimal
code representations for noise-free patterns satisfying the conditions:

(1) The number of patterns P is finite.

(2) All of the patterns are of the "two-tone" (black-and-white) variety, and

(3) any of the given patterns can be divided into a finite number of cells C,
each of which is either wholly black or wholly white.

Glovazky restricted his discussion to C-digit binary codes formed by numbering
the cells in an arbitrary sequence called (for obvious reasons) the
"scanning path", and then setting the i-th digit of the code equal to
one (zero) if the i-th cell is black (white). Glovazky gives two methods,
each of which enables one to minimize a given scanning path by
eliminating superfluous cells. The reduced number of cells obtained
is the minimum permissible number for the given scanning path. It
is quite possible that a different scanning path will yield a smaller
number of cells, and therefore a more economical identification
process.

Glovazky's first procedure involves the construction of a code mobile,
a figure quite similar to the "character trees" discussed previously.


FIGURE  1.36 
The  "Code  Mobile" 


A typical code mobile, for the case P=6, C=9, is shown in Fig. 1.36. Each
of the nine "levels" of the mobile corresponds to inspection of a specific
cell of the pattern. Each "chain" of the mobile represents the results of
scanning one of the six patterns according to the assumed scanning path. At
each level, the chain is diverted to one side or the other according to whether
the pattern is black or white in
the corresponding cell. The code mobile shown therefore corresponds to
patterns with binary codes as follows:


      123456789

1     111010010
2     011010111
3     010111010
4     100100111
5     111100100
6     111100111

      Table 1-1

The points of interconnection of the chains are called "nodes". The
presence of a node at a given level of the mobile indicates that the
corresponding cell in the scanning path is relevant and cannot be eliminated. If,
on the other hand, there are no nodes at a given level, then the corresponding
cell can be eliminated from the scanning path without impeding the
recognition process in any way. Inspection of Fig. 1.36 reveals that cells 3, 5,
6, 7 and 9 can be eliminated. The resulting reduced code mobile is shown
in Fig. 1.37, and the reduced binary codes for the patterns are:

      1248

1     1101
2     0101
3     0111
4     1011
5     1110
6     1111

      Table 1-2


The validity of the above reduction can be confirmed by observing that no
two rows of Table 1-2 are identical, and that no column can be deleted
without destroying the distinctness of the rows.
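
This check is mechanical and can be sketched directly from the codes of Table 1-1. The helper names below are hypothetical; cells are numbered from 1, as in the tables.

```python
TABLE_1_1 = [
    "111010010",
    "011010111",
    "010111010",
    "100100111",
    "111100100",
    "111100111",
]


def project(codes, cells):
    """Keep only the listed cells (1-based) of every code."""
    return ["".join(code[c - 1] for c in cells) for code in codes]


def is_distinct(codes, cells):
    """True if no two patterns agree on all of the listed cells."""
    rows = project(codes, cells)
    return len(set(rows)) == len(rows)


def is_irreducible(codes, cells):
    """True if the cells separate the patterns and none can be dropped."""
    return is_distinct(codes, cells) and all(
        not is_distinct(codes, [c for c in cells if c != dropped])
        for dropped in cells
    )
```

Here `project(TABLE_1_1, [1, 2, 4, 8])` reproduces the reduced codes, and `is_irreducible(TABLE_1_1, [1, 2, 4, 8])` confirms both observations at once.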

A necessary and sufficient condition for recognition of the patterns
in the minimum possible number of cells, $\lceil \log_2 P \rceil$, is clearly the following:
at each level of the code mobile, each chain must branch. It follows
immediately that at the $n$-th level, there must be $2^{n-1}$ nodes. It is also evident that
if $P = 2^{\nu}$, where $\nu$ is an integer, each cell in the absolute minimal scanning
path must be black for precisely one half of the characters and white for the
remaining half. This statement is approximately true when $P \neq 2^{\nu}$.


FIGURE  1.37 
In practical problems, construction of the code mobile may prove
very cumbersome due to large values of P and C. Glovazky's second
method, which is actually equivalent to the code mobile procedure, is more
useful under such circumstances. The binary codes are first arranged in a
sequence of descending numerical value, known as a "code schedule" (Table
1-3).

Table 1-3: Code schedule for Table 1-1 and associated separation lines.

The procedure consists in finding the points on the code schedule
corresponding to the nodes of the code mobile. Proceeding from left to right, the
first column is found in which 1 neighbors 0 (column 1 in Table 1-3). A
separation line is then drawn between the two digits extending from the
column in question to the right as shown. The arrays on each side of this
line can now be regarded as new schedules. The process is repeated (Table
1-3) until all the columns are exhausted or each of the resulting
"subschedules" contains only one row.




The points at which the separation lines begin correspond to the
nodes of the code mobile. Hence, all columns which do not contain such
points can be eliminated, yielding again the reduced code schedule of Table
1-2.
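
The code schedule procedure lends itself to a short recursive sketch. The function name and the four-pattern example are hypothetical and chosen only to show a column being eliminated; codes are given as strings of 0's and 1's and are assumed to be distinct and of equal length.

```python
def node_columns(codes):
    """Return the 1-based columns at which separation lines begin."""
    schedule = sorted(set(codes), reverse=True)  # descending numerical value
    width = len(schedule[0])
    kept = set()

    def split(rows, col):
        if len(rows) <= 1 or col >= width:
            return
        ones = [r for r in rows if r[col] == "1"]
        zeros = [r for r in rows if r[col] == "0"]
        if ones and zeros:
            # A 1 neighbors a 0 here: a separation line starts in this
            # column, and each side becomes a new schedule.
            kept.add(col + 1)
            split(ones, col + 1)
            split(zeros, col + 1)
        else:
            split(rows, col + 1)

    split(schedule, 0)
    return sorted(kept)


# Hypothetical example: columns 2 and 4 carry no separation line.
EXAMPLE_CODES = ["1011", "1001", "0011", "0001"]
```

For the example, `node_columns(EXAMPLE_CODES)` gives `[1, 3]`, so columns 2 and 4 can be eliminated from that scanning path.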

It so happens that the patterns represented by Table 1-1 can be
identified by scanning only three cells. It can be readily verified that
either cells 3, 4 and 9 or 3, 5 and 9 (in each case one cell less than the
minimum of 4 cells necessary for the scanning path considered above) are
sufficient to identify the pattern. The problem of finding a scanning path
which can be reduced to an absolute minimal scanning path is very difficult,
at least for practical magnitudes of P and C. Using Glovazky's method,
one would have to examine all possible cell sequences, a process far too
lengthy to be feasible. However, it can be shown that the reduced scanning
sequences obtained by Glovazky's method are at most P-1 cells in length.
Therefore, for cases in which $C \gg (P-1)$, and P-1 is a sufficiently small
number, Glovazky's method always yields satisfactorily reduced cell
sequences. Moreover, one can greatly increase the chances of finding an
absolute minimal scanning path by placing at the beginning of the code
schedule those columns in which the ratio between "ones" and "zeros" is
closest to unity (i.e., those columns with the highest degree of uncertainty)
(10). This method breaks down, however, whenever this ratio is close to
1 for all columns. Under these circumstances, unless the scanning path is
already minimal, one finds that there is a great deal of duplication of
information by the various cells. In other words, there is a large amount of
redundancy in the system. Additional rules for improving the effectiveness
of the code schedule technique have been developed by O. Lowenschuss (17)
and M. Pauli (20).

Arthur Gill (19) gives a rather lengthy method, requiring computer
processing, which always yields the absolute minimal scanning path.
Briefly, the process is as follows. One begins by considering an arbitrary
group of P patterns composed of C cells and the associated PxC "code
schedule" of the type shown in Table 1-1. Assume first that only the
pattern represented by the first row of the code schedule is to be
recognized. It is evident that any one of the cells 1 to C may then be
selected as a legitimate path. Suppose now that pattern 2 is added to the
set. Then each of the paths selected previously may or may not still be
adequate, depending on whether or not the part of pattern 1 in that path
coincides with the part of pattern 2 in that path. If such a coincidence
occurs, the path in question has to be augmented by adding to it another
cell in which patterns 1 and 2 differ. Carrying out all possible
augmentations in this manner, a revised list of paths is obtained,
including all paths adequate for recognition of patterns 1 and 2. Next,
pattern 3 is added to the set, and the augmentation process is repeated,
yielding a list of paths adequate for recognition of the first three
patterns. The process is continued until the last row of the code schedule
has been added to the set, at which point a list of all possible paths
(consisting of at most P cells) which are adequate for recognition of the
P patterns is obtained. The scanning path(s) containing the least number
of cells can be easily found by inspection of the list.
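The augmentation process just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Gill's own formulation: the code schedule is taken to be a list of equal-length tuples of 0s and 1s, cells are numbered from 0, and the function names `scanning_paths` and `minimal_paths` are hypothetical.

```python
def scanning_paths(schedule):
    """Return every path produced by the row-by-row augmentation process.

    `schedule` is a list of P rows (patterns), each a tuple of C cell values.
    A path is a frozenset of cell indices.
    """
    n_cells = len(schedule[0])
    # With only the first pattern present, any single cell is a legitimate path.
    paths = {frozenset([c]) for c in range(n_cells)}
    for k in range(1, len(schedule)):
        new_row = schedule[k]
        revised = set()
        for path in paths:
            # Look for an earlier pattern that coincides with the new one on this path.
            # Because the path already separates the earlier patterns, there is at most one.
            clash = next(
                (row for row in schedule[:k]
                 if all(row[c] == new_row[c] for c in path)),
                None,
            )
            if clash is None:
                revised.add(path)              # path is still adequate
            else:
                # Augment with each cell in which the two clashing patterns differ.
                for c in range(n_cells):
                    if clash[c] != new_row[c]:
                        revised.add(path | {c})
        paths = revised
    return paths


def minimal_paths(schedule):
    """Pick, by inspection of the list, the paths with the fewest cells."""
    paths = scanning_paths(schedule)
    if not paths:
        return []
    best = min(len(p) for p in paths)
    return sorted(sorted(p) for p in paths if len(p) == best)


schedule = [(0, 0, 0), (0, 1, 0), (1, 0, 1), (1, 1, 1)]
print(minimal_paths(schedule))
```

For this four-pattern schedule the cells 0 and 2 carry the same information, so either of them can be paired with cell 1 to form a two-cell path.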

Gill has formulated the above process in a manner suitable for computer
programming. In addition, he has developed a more general procedure to
deal with the "noisy" case in which either the patterns or the scanner is
imperfect. In the noisy case, it is desirable to maintain a minimum
"distance" d > 1 between the codes representing the various patterns. In
other words, a scanning path must be found such that any two patterns will
differ in at least d of the cells in the path. If this condition is
satisfied, it is always possible to detect d-1 errors or to correct
$\lfloor (d-1)/2 \rfloor$ errors in a given pattern. (12)

The procedure for finding a minimal scanning path consistent with a
prescribed d, if such a path exists, is rather involved and will not be
given here.
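Although the search procedure itself is not given, checking whether a candidate path meets a prescribed distance is straightforward. The sketch below is an illustration only; the names `path_distance` and `satisfies_distance` are hypothetical, and the code schedule is assumed to be a list of tuples of 0s and 1s with cells numbered from 0.

```python
from itertools import combinations


def path_distance(schedule, path):
    """Smallest number of path cells in which any two patterns differ."""
    return min(
        sum(a[c] != b[c] for c in path)
        for a, b in combinations(schedule, 2)
    )


def satisfies_distance(schedule, path, d):
    """True if every pair of patterns differs in at least d cells of the path."""
    return path_distance(schedule, path) >= d


schedule = [(0, 0, 0, 0), (0, 1, 1, 0), (1, 0, 1, 1), (1, 1, 0, 1)]
print(path_distance(schedule, [0, 1]))      # two cells suffice for recognition only
print(path_distance(schedule, [0, 1, 2]))   # a third cell raises the distance
```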

It will be recalled that a necessary (but not sufficient) condition for
the recognition of $2^n$ patterns with the absolute minimal number of
cells is that the cells employed have the "binary division" property,
i.e., the cells must be black for half of the characters and white for the
remainder. This fact was used by Breuer (3) as the basis of his search for
significant cells. Inasmuch as Breuer dealt with a set of highly distorted
characters (Fig. 1.31), it was necessary to employ an integer threshold
$T$ to determine whether a cell should be regarded as black for a given
character, white for the character, or neither black nor white. More
precisely, Breuer defined a cell $c_{ij}$ to be "almost always black" for
the character $N_k$ if $c_{ij}$ was black more than $(t - T)$ times out of
a set of $t$ representative samples of $N_k$. If $c_{ij}$ is black in
fewer than $T$ of the samples, it is declared "almost always white". If
neither of these conditions holds, the cell is said to be "neutral" or
"grey" in $N_k$. Of course, the inequality $T < t/2$ must hold.

Assume that K, the total number of prototype characters, is even. Then,
in Breuer's formulation, a cell is said to have the "binary division"
property if it is almost always white for K/2 characters and almost always
black for the remaining characters. Breuer programmed the IBM 709 computer
to determine, for various thresholds, which cells have the binary division
property. Breuer's routine also produces a table in which the number of
times each cell was black for each character is indicated.
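The threshold test and the binary division check can be sketched as follows. This is not Breuer's IBM 709 routine; it is a minimal illustration that assumes strict inequalities ("more than t - T" for black, "fewer than T" for white), and the function names are hypothetical. The third function applies the weaker "pseudo-significant" test that the text introduces further on.

```python
def classify(black_count, t, threshold):
    """Label one cell for one character, given its black count in t samples."""
    if black_count > t - threshold:
        return "black"       # almost always black
    if black_count < threshold:
        return "white"       # almost always white
    return "neutral"         # grey


def has_binary_division(black_counts, t, threshold):
    """black_counts[k] = number of samples of character k in which the cell is black."""
    labels = [classify(n, t, threshold) for n in black_counts]
    half = len(labels) // 2
    return labels.count("white") == half and labels.count("black") == len(labels) - half


def is_pseudo_significant(black_counts, t, threshold):
    """Weaker test: black for at least one character and white for at least one."""
    labels = [classify(n, t, threshold) for n in black_counts]
    return "black" in labels and "white" in labels


counts = [18, 17, 16, 0, 1, 2]          # six characters, t = 18 samples each
print(has_binary_division(counts, t=18, threshold=3))
```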

Breuer considered the possibility of constructing a direct-logic
recognition system based upon "almost always white" and "almost always
black" cells. In such a system, the "character tree" is constructed in the
same manner as previously, with the "almost always white" and "almost
always black" conditions replacing the "always white" and "always black"
conditions used then. At each testing point, therefore, a worst-case
probability of error ($T/t$) must be reckoned with. The worst-case
probability of correct recognition after $n$ such tests is obviously
$\sigma^n$, where $\sigma = 1 - T/t$. It can be seen that the worst-case
correct recognition rate in such a system drops rapidly with increasing
$T$, especially for large values of $n$.

In order to increase the reliability of the recognition process, Breuer
proposed using redundant cells in various ways. For example, if there
exist two logical representations for a character $N_k$, namely
$f_1(C_1, C_2, C_3, C_4)$ and $f_2(C_5, C_6, C_7, C_8)$, where $f_1$ and
$f_2$ are arbitrary Boolean functions, then the recognition reliability
for this character can be increased by testing all of the cells
$C_1, C_2, \ldots, C_8$ and identifying $N_k$ according to the formula
$f_1(C_1, C_2, C_3, C_4) \cup f_2(C_5, C_6, C_7, C_8)$. Breuer showed that
this scheme results in a worst-case recognition rate of
$\sigma^4 (2 - \sigma^4)$ instead of $\sigma^4$ as in the simple 4-cell
schemes. Since $\sigma^4 \le 1$, this always represents an increase in
reliability.
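A short numerical check makes the gain from redundancy concrete. The sketch assumes that each test fails with the worst-case probability T/t, that tests fail independently, and that the two four-cell representations are combined by a logical OR; the function names and the sample values T = 3, t = 18 are illustrative only.

```python
def single_rate(T, t, n):
    """Worst-case probability that n successive cell tests are all correct."""
    return (1 - T / t) ** n


def redundant_rate(T, t, n=4):
    """Worst-case rate when either of two independent n-cell representations may succeed."""
    s = single_rate(T, t, n)
    return 1 - (1 - s) ** 2      # algebraically equal to s * (2 - s)


for T in (1, 2, 3):
    print(T, round(single_rate(T, 18, 4), 4), round(redundant_rate(T, 18), 4))
```

The redundant figure is never smaller than the single-path figure, since s(2 - s) >= s whenever s <= 1.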

Of course, it is not necessary for a cell to have the binary division
property in order to employ it in a recognition scheme. All that is really
necessary is that the cell have some discriminating power. That is, the
cell must be almost always white for at least one character, almost always
black for at least one character, and (possibly) neutral for one or more
characters. Such cells were termed "pseudo-significant" by Breuer, and
provision was made in the binary division routine for detecting them.

Because his character samples were highly distorted, Breuer's binary
division routine yielded rather disappointing results. Substantial
numbers of binary-division cells could only be found when very large
thresholds were used. In Fig. 1.38 the binary division cells for
$T \le 5$ found by Breuer for 18 samples (1, 3, ..., 35) of the IBM
numerals are shown. The number in each binary division cell indicates the
minimum value of $T$ for which the cell satisfies the binary division
criterion. The samples had been filtered to remove noise and aligned with
Breuer's centroidal translation program.

Figure 1.39 shows the "pseudo-significant" cells found by Breuer, based
upon the same set of samples as Fig. 1.38. The "binary division" cells
are, of course, automatically included in this set. The maximum threshold
value used was $T = 3$. The significant areas are located, as one might
expect, away from the border areas of the field, which are white for all
characters. The right-hand border, however, is excepted from this rule
due to the large amounts of non-random noise occurring in this region
(Fig. 1.31).

Breuer suggested that a recognition scheme (e.g., his masking routine)
could recognize the characters by considering only the highly significant
binary-division cells or, if necessary, the "pseudo-significant" cells.
Presumably, ignoring the other cells would not result in sufficient
information loss to impair recognition. Due to time limitations, however,
Breuer did not actually test this hypothesis experimentally.

Because of the high threshold values necessary, Breuer did not consider
it worthwhile to attempt recognition by means of a direct logic system
based upon the binary division or "pseudo-significant" cells.

Breuer also investigated the use of Baran's black and white cell weight
factors (see Sec. 1.5.4, "Baran's Work") as a measure of cell
significance. Unfortunately, Breuer's analysis is very unclear and no
conclusions concerning the validity of the cell weight factor were
explicitly stated. As a matter of fact, the cell weight factors are an
extremely poor measure of cell significance. For example, consider the
following two distributions of black cells found in t samples of each of
10 characters:

Clearly, cell 2 has absolutely no significance; cell 1, on the other hand,
has the binary division property with zero threshold, and hence is of
great significance. Nevertheless, the black and white cell weight factors
for these cells are identical. This is ample proof of the uselessness of
the cell weight factors as a measure of cell significance.

A quite different approach toward the minimal representation problem has
been taken by E. L. Blokh (2). Blokh based his work on the following
assumptions:

(1) The number of patterns K is finite.

(2) The patterns are undistorted and free of noise.

(3) The patterns consist of C cells, each of which may exist in one of m
distinct states or "colors".

(4) Each pattern $N_i$ is assumed to have an a priori probability of
occurrence $p_i$ ($i = 1, 2, \ldots, K$).

In the minimal representation techniques discussed above, the code length
for each pattern was the same. Blokh's method, on the other hand, permits
non-uniform code lengths for the various patterns. A scanning method is
devised, if possible, such that the expected value of the code lengths is
a minimum. This amounts to finding a scanning path which permits the most
frequent patterns to be identified on the basis of a small number of
cells. Larger numbers of cells would have to be scanned in order to
identify the less frequent patterns. Of course, to avoid confusion, the
code for one pattern must never be identical with the leading digits of
the longer code representing a less common pattern.

Blokh's procedure is as follows: The uncertainty in bits associated with
each cell $c_i$ is computed according to the formula

$$H(c_i) = -\sum_{j=1}^{m} p(j_i) \log_2 p(j_i)$$

where $p(j_i)$ denotes the probability that $c_i$ has the $j$th color. As
the first cell $k_1$ of the scanning sequence, one chooses that cell
having the greatest uncertainty value. In case of a tie, an arbitrary
choice is made. The second cell $k_2$ is chosen such that its conditional
uncertainty $H(k_2 | k_1)$, given that the color of $k_1$ is known, is
greater than that of all the other cells. Again, an arbitrary choice is
made in case of a tie. The conditional uncertainties are found according
to the formula

$$H(c_i | k_1) = -\sum_{j=1}^{m} \sum_{l=1}^{m} p(j_i, l_{k_1}) \log_2 p(j_i | l_{k_1})$$

where $p(j_i, l_{k_1})$ denotes the joint probability that $c_i$ has the
$j$th color and $k_1$ has the $l$th color, and $p(j_i | l_{k_1})$ denotes
the conditional probability that $c_i$ has the $j$th color given that
$k_1$ has the $l$th color. The process is continued until some number r of
cells have been chosen such that $H(k_r | k_1, k_2, \ldots, k_{r-1}) \ne 0$
and $H(k_{r+1} | k_1, \ldots, k_r) = 0$ for all choices of $k_{r+1}$. This
condition insures that the r cells are always sufficient to identify the
unknown pattern uniquely. The resulting codes for the patterns can be
shortened by eliminating trailing digits, wherever possible.
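The selection rule above amounts to a greedy search, which can be sketched as follows. This is an illustration under stated assumptions rather than Blokh's own program: patterns are noise-free tuples of colors, `probs` holds their a priori probabilities, ties are broken by taking the lowest-numbered cell, and the function names are hypothetical. The final step of trimming trailing digits is not included.

```python
from collections import defaultdict
from math import log2


def conditional_uncertainty(patterns, probs, chosen, cell):
    """H(cell | chosen cells) in bits, for noise-free patterns."""
    joint = defaultdict(float)   # p(colors of chosen cells, color of candidate cell)
    given = defaultdict(float)   # p(colors of chosen cells)
    for pattern, p in zip(patterns, probs):
        key = tuple(pattern[k] for k in chosen)
        joint[key, pattern[cell]] += p
        given[key] += p
    return -sum(p * log2(p / given[key]) for (key, _), p in joint.items() if p > 0)


def blokh_scanning_sequence(patterns, probs, eps=1e-12):
    """Greedily pick the cell of greatest conditional uncertainty until none is left."""
    n_cells = len(patterns[0])
    chosen = []
    while len(chosen) < n_cells:
        scores = {
            c: conditional_uncertainty(patterns, probs, chosen, c)
            for c in range(n_cells) if c not in chosen
        }
        best = max(scores, key=scores.get)   # first maximum wins a tie
        if scores[best] <= eps:              # every remaining cell is already determined
            break
        chosen.append(best)
    return chosen


patterns = [(0, 0, 0), (0, 1, 0), (1, 0, 1), (1, 1, 1)]
print(blokh_scanning_sequence(patterns, [0.25] * 4))
```

With an empty `chosen` list the same function returns the unconditional uncertainty H(c) used to pick the first cell.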

The above procedure does not always generate the optimum statistical
scanning path, because the cell having the greatest uncertainty at a
given step need not lead to the greatest joint uncertainty over the steps
that follow. Blokh claims, however, that in almost all cases a minimum or
near-minimum description is obtainable by this method.

It will be recognized that Blokh's formulation of the minimum
representation problem is similar to the optimum statistical coding
problem of information theory. A basic theorem of Shannon (23) yields the
following relationship when applied to the problem at hand:

$$\bar{n} \ge \bar{n}_0 \ge \frac{H}{\log_2 m}$$

where $\bar{n}_0$ is the average word length of the optimum statistical
code, $\bar{n}$ is the average word length of an arbitrary code, and $H$
is the uncertainty* (entropy, information) of the system of patterns,
expressed in bits. It is worth observing that a necessary condition for
the achievement of the optimum statistical code by Blokh's scheme is that
each cell chosen have the maximum possible conditional uncertainty, i.e.,
each cell must divide the patterns into m equally probable groups,
depending upon its color. For the case where m = 2 and the patterns are
equiprobable, this condition reduces to the familiar "binary division"
property discussed previously, and Shannon's formula reduces to the
familiar condition $\bar{n} \ge \log_2 K$. Hence Blokh's technique may be
regarded as an extension of the previous techniques to the case of
unequally probable patterns.

A second serious shortcoming of the method is its lack of provision for
dealing with distortion and noise. It is still possible to compute
Blokh's cell significance criterion, H(c), when noise is present.
However, it will be shown in Section 1.6 that this criterion is no longer
valid in the noisy case. This limits the applicability of the Blokh
technique in practice to those relatively rare situations in which the
noise level is insignificant.

In Section 1.6 of this report and Chapter III of UCLA Report 62-68, an
alternate procedure for finding significant cells when noise is present
will be given.


*  The various names given for the function H are all in common use.
For uniformity, the term uncertainty will be used throughout this
report.




1.6  The Conditional Uncertainty Criterion for Cell Significance

In this section, a measure of cell significance will be introduced and
justified on the basis of intuitive ideas concerning the notion of
"significance". A few such ideas have been discussed in the introduction,
where qualitative observations were made concerning the effects of noise,
variations of a priori probabilities, etc. upon the significance of
cells. In Section 1.6.2 of this chapter, these qualitative observations
will be made more precise. Where possible, it will be shown analytically
that the significance measure introduced here conforms to the intuitive
requirements. The final justification for the significance measure is
experimental and is described in Chapters III and IV of UCLA Report
62-68.

1.6.1  The Conditional Uncertainty

Let us suppose, as in Section 1.5.4, that we are dealing with a set of
characters $N_1, N_2, \ldots, N_K$ having a priori probabilities of
occurrence $p(N_1), p(N_2), \ldots, p(N_K)$ respectively, and that the
characters are in the form of binary digit matrices containing IxJ cells
$c_{ij}$ (Fig. 1.19). Let

$$H(N) = -\sum_{k=1}^{K} p(N_k) \log_2 p(N_k)$$

denote the uncertainty (measured in bits) in the system N consisting of
the events $N_1, N_2, \ldots, N_K$ (the "event $N_k$" signifies, of
course, the occurrence of character $N_k$). On the basis of a large number
of representative samples of each character, it is possible to compute
(e.g., by using Baran's recognition technique*) the a posteriori
probabilities $p(N_k/w_{ij})$** and $p(N_k/b_{ij})$** for each cell
$c_{ij}$ and each character $N_k$. It is therefore also possible to
compute the conditional uncertainty in the system N, given that $c_{ij}$
has been observed:

$$H(N/c_{ij}) = -p(w_{ij}) \sum_{k=1}^{K} p(N_k/w_{ij}) \log_2 p(N_k/w_{ij}) - p(b_{ij}) \sum_{k=1}^{K} p(N_k/b_{ij}) \log_2 p(N_k/b_{ij})$$

The smaller the value of $H(N/c_{ij})$, the smaller the average
uncertainty concerning the identity of the unknown character is after
$c_{ij}$ has been observed. It is, therefore, natural to make the
following claim: the smaller $H(N/c_{ij})$ is, the greater the
significance of $c_{ij}$ is.
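As an illustration, the first-order conditional uncertainty of a single cell can be estimated from sample counts. The sketch below does not reproduce Baran's technique; it assumes that the probability of the cell being black in character k is estimated as the fraction of black samples, and that the a posteriori probabilities follow from Bayes' rule. The function names are hypothetical.

```python
from math import log2


def uncertainty(probs):
    """H = -sum p log2 p in bits; terms with p = 0 contribute nothing."""
    return -sum(p * log2(p) for p in probs if p > 0)


def conditional_uncertainty(priors, black_counts, t):
    """H(N/c) for one cell, from the number of black samples out of t per character."""
    p_black_given = [n / t for n in black_counts]            # estimate of p(b/N_k)
    p_b = sum(p * q for p, q in zip(priors, p_black_given))  # p(b)
    h = 0.0
    for p_color, likelihood in (
        (p_b, p_black_given),
        (1 - p_b, [1 - q for q in p_black_given]),
    ):
        if p_color > 0:
            # Bayes' rule: p(N_k/color) = p(N_k) p(color/N_k) / p(color)
            posterior = [p * q / p_color for p, q in zip(priors, likelihood)]
            h += p_color * uncertainty(posterior)
    return h


priors = [0.5, 0.5]
print(conditional_uncertainty(priors, [10, 0], t=10))   # cell separates the two characters
print(conditional_uncertainty(priors, [5, 5], t=10))    # cell tells nothing
```

A cell that separates the two characters perfectly leaves no uncertainty, while a cell that is black half the time in both characters leaves the full one bit of H(N).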

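Using similar statistical techniques, the process can be continued to
determine the significance of a group of q cells. We have:

$$H(N/c_{\alpha_1}, c_{\alpha_2}, \ldots, c_{\alpha_q}) = -\sum_{k=1}^{K} \Big[ \sum_{(v_1, \ldots, v_q)} p(v_1, \ldots, v_q)\, p(N_k/v_1, \ldots, v_q) \log_2 p(N_k/v_1, \ldots, v_q) \Big]$$

where the variables $v_i$ take on the values $w_{\alpha_i}$ and
$b_{\alpha_i}$, and the sum in brackets is taken over all possible
resulting values of the vector $(v_1, v_2, \ldots, v_q)$. Again, it is
asserted that the smaller
$H(N/c_{\alpha_1}, c_{\alpha_2}, \ldots, c_{\alpha_q})$ is, the greater
the significance of the group of cells
$c_{\alpha_1}, c_{\alpha_2}, \ldots, c_{\alpha_q}$ is. The probabilities
$p(N_k/v_1, \ldots, v_q)$ and $p(v_1, \ldots, v_q)$ can be satisfactorily
approximated by formulas such as (I-1), provided that the cells
$c_{\alpha_1}, c_{\alpha_2}, \ldots, c_{\alpha_q}$ are sufficiently
independent. The conditional uncertainty, given the information contained
in a group of q cells, will be referred to as a qth-order conditional
uncertainty.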

*  Section 1.5.4.

**  The notation introduced in Section 1.5.4 is adhered to in this section.

It is clear that the computational difficulties increase very rapidly
with increasing q. In fact, there are $K \cdot 2^q$ terms in the
expression for the qth-order conditional uncertainty. Moreover, the
individual terms become increasingly more difficult to compute as q
increases. For this reason, it is not practical to compute conditional
uncertainties for orders higher than the first few. An alternate
approximate procedure for finding significant groups of cells, without
incurring the computational difficulties of higher order conditional
uncertainties, will be presented in Section 1.6.3.

It is evident that Blokh's criterion H(c) for finding significant cells
is similar in some respects to the conditional uncertainty criterion
H(N/c) introduced here. It will now be shown that the two criteria are
equivalent in the absence of noise, but that in the noisy case only the
criterion given here is reliable. Consider the following basic
relationship between joint and conditional uncertainties:*

$$H(c, N) = H(c) + H(N/c) = H(N) + H(c/N)$$

In the noiseless case, H(c/N) = 0, since the color of any cell c is
uniquely determined once the identity of the character is given. Hence,
for ideal characters, H(c, N) = H(N/c) + H(c) = H(N) = constant. Clearly,
the larger H(c) is, the smaller H(N/c) must be, and vice versa. Blokh
associated cell significance with large values of H(c), whereas small
values of H(N/c) are used in this study as a criterion of cell
significance. Hence, the two criteria are, indeed, equivalent when noise
is absent.
*  See Feinstein (7).

When noise is present, on the other hand, large values of H(c) may be due
to the noise rather than to actual significance of the cell c. Under these
conditions the conditional uncertainty criterion H(N/c) is a far more
reliable measure of cell significance, for it measures directly the
pertinent quantity, i.e., the uncertainty in the identity of the unknown
pattern. To illustrate this point, suppose that cell c is ideally white in
each of the possible characters. Let us now assume the following noise
distribution: for each character, there is a probability of 1/2 that cell
c will appear black rather than white. Since this type of noise is
independent of the character to be recognized, cell c still yields no
information concerning the identity of the unknown character, and hence
should be given the lowest possible significance rating. However,
H(c) = 1 bit, since there is an equal probability of c being white or
black. But this is the maximum possible value for H(c) in a two-tone
pattern, and hence, according to Blokh's criterion, cell c would be
falsely considered maximally significant. On the other hand, it is readily
computed that with the noise present $p(w_c) = p(b_c) = 1/2$ and
$p(N_k/w_c) = p(N_k/b_c) = p(N_k)$, whence H(N/c) = H(N). But H(N) is the
maximum possible value which H(N/c) can assume (see next section).
Therefore, according to the H(N/c) criterion, cell c would be given the
lowest possible significance ranking, as it should be. This rather extreme
example tends to support the conjecture that H(N/c), but not H(c), is a
reliable measure of cell significance in the presence of noise. More will
be said about this in the next section.
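The computation referred to as "readily computed" can be written out with Bayes' rule. In this worked sketch, $p(b/N_k) = 1/2$ is the noise assumption of the example, and $w$ and $b$ refer to the white and black states of the cell c; the use of Bayes' rule is an assumption about how the a posteriori probabilities are obtained.

```latex
\begin{align*}
p(b) &= \sum_{k=1}^{K} p(N_k)\, p(b/N_k) = \tfrac{1}{2}, \qquad p(w) = 1 - p(b) = \tfrac{1}{2}, \\
p(N_k/b) &= \frac{p(N_k)\, p(b/N_k)}{p(b)} = p(N_k), \qquad p(N_k/w) = p(N_k) \ \text{likewise}, \\
H(N/c) &= -\sum_{v \in \{w,\, b\}} p(v) \sum_{k=1}^{K} p(N_k/v) \log_2 p(N_k/v) = H(N), \\
H(c) &= -\tfrac{1}{2} \log_2 \tfrac{1}{2} - \tfrac{1}{2} \log_2 \tfrac{1}{2} = 1 \ \text{bit}.
\end{align*}
```

Observing c therefore removes none of the uncertainty about the character, even though H(c) takes its largest possible value.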

1.6.2  Justification of the Conditional Uncertainty Criterion

In this section it will be shown that the conditional uncertainty
function has certain properties which tend to justify its use as a measure
of cell significance. First, however, it is convenient to introduce the
following notation:

(1) $S(c) = H(N) - H(N/c)$

(2) $S(c_i/c_j) = H(N/c_j) - H(N/c_i, c_j)$

(3) $S(c_{\alpha_1}, c_{\alpha_2}, \ldots, c_{\alpha_q}) = H(N) - H(N/c_{\alpha_1}, c_{\alpha_2}, \ldots, c_{\alpha_q})$

where H(N), H(N/c), etc. are as defined in the preceding section. The
function S(c) will be regarded as a numerical measure of the significance
of the cell c. Similar remarks hold for the functions $S(c_i/c_j)$,
$S(c_{\alpha_1}, c_{\alpha_2}, \ldots, c_{\alpha_q})$, etc. Clearly, the
above definitions are in accordance with the previous assertion that cell
significance is a decreasing function of the appropriate conditional
uncertainty. Moreover, the function S(c) possesses the following property
which one would naturally require of a valid measure of significance:

Property A. $S(c) \ge 0$, with equality if and only if
$p(N_k/w_c) = p(N_k/b_c) = p(N_k)$ for all k.*

*  This condition for S(c) = 0 is of course precisely the circumstance
under which one would say "cell c yields no information concerning
the identity of the unknown character". In other words, S(c) = 0
corresponds to the case of zero cell significance, as it should.

Proof: This property is an immediate consequence of Shannon's Fundamental
Inequality, which states that $H(N/c) \le H(N)$ when particularized to the
case considered here. Furthermore, it is shown in Feinstein (7), pages
15-16, that the condition for equality is as stated above.
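Property A can also be read off from the relationship between joint and conditional uncertainties. The short derivation below is a sketch that relies on the standard chain-rule identity for H(c, N); the symbol $v$, ranging over the two states w and b of cell c, is introduced here for convenience.

```latex
\begin{align*}
H(c, N) &= H(c) + H(N/c) = H(N) + H(c/N) \\
\Rightarrow\quad S(c) &= H(N) - H(N/c) = H(c) - H(c/N), \\
S(c) &= \sum_{v \in \{w,\, b\}} \sum_{k=1}^{K} p(N_k, v) \log_2 \frac{p(N_k/v)}{p(N_k)} \;\ge\; 0 .
\end{align*}
```

The double sum vanishes exactly when $p(N_k/v) = p(N_k)$ for every k and both states v, which is the equality condition of Property A.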

The following property of the significance function is a special case of
the effects of new information concerning the unknown character upon
previously estimated cell significance ratings.

Property B. The quantity $S(c_i/c_j)$ denotes the significance to be
attributed to $c_i$, on the assumption that $c_j$ has been observed. If
the cells $c_i$ and $c_j$ are highly correlated, i.e., if the state of
cell $c_i$ can be predicted with great certainty if that of $c_j$ is
known, and vice versa, we should have $S(c_i/c_j) \approx 0$ and
$S(c_j/c_i) \approx 0$. In particular, S(c/c) = 0 for any cell c.

Proof: It is sufficient to show that $H(N/c_j) \approx H(N/c_i, c_j)$.
Assume for definiteness that $c_i$ is almost always white when $c_j$ is
white, and vice versa. Then $p(w_i, w_j) \approx p(w_j)$. Also,
$p(N_k/w_i, w_j) \approx p(N_k/w_j)$. Similar relations hold if "w" is
replaced by "b" in the above formulas. Hence
$H(N/c_i, c_j) \approx H(N/c_j)$, completing the proof.




The dependence of the cell significance function upon the a priori
probabilities has also been discussed in the introduction. The following
property is stated as an example of the dependence.

Property C. Suppose that cell $c_1$ is always black in $N_1$ and always
white in the remaining characters. In other words, the significance of
$c_1$ depends upon the fact that this cell differentiates between $N_1$
and the remaining characters. Similarly, suppose that cell $c_2$ is always
black in $N_2$, always white in the remaining characters, and that
$1/2 > p(N_1) > p(N_2)$. Then $S(c_1) > S(c_2)$.

Proof: It is necessary to show that $H(N/c_1) < H(N/c_2)$ under the stated
conditions. We begin with the relations*

$$H(N, c_1) = H(N) + H(c_1/N) = H(N/c_1) + H(c_1)$$

$$H(N, c_2) = H(N) + H(c_2/N) = H(N/c_2) + H(c_2)$$

Under the conditions of the hypothesis, it is clear that
$H(c_1/N) = H(c_2/N) = 0$. Therefore, we have
$H(N) = H(N/c_1) + H(c_1) = H(N/c_2) + H(c_2)$. The proof now reduces to
showing that $H(c_2) < H(c_1)$. This is quite simple: $H(c_1)$ is the
uncertainty of the 2-event system whose outcomes are "$c_1$ black" or
"$c_1$ white". The probabilities of these events are $p(b_1) = p(N_1)$ and
$p(w_1) = p(N_2) + p(N_3) + \ldots + p(N_K) = 1 - p(N_1)$ respectively.
Similar statements apply to $H(c_2)$. Referring to the 2-event uncertainty
curve shown in Figure 1.40, it is seen that the uncertainty function is
monotonically increasing for $0 \le p \le 1/2$. Since
$1/2 > p(N_1) > p(N_2)$ by hypothesis, it can, indeed, be concluded that
$H(c_2) < H(c_1)$, completing the proof.

*  Khinchin (16), p. 6.
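Property C is easy to check numerically. The sketch below assumes noise-free "marker" cells, each black in exactly one character, so that H(c/N) = 0 and the significance S(c) equals the two-event uncertainty H(c). The function names and the sample a priori probabilities are illustrative only.

```python
from math import log2


def two_event_uncertainty(p):
    """Uncertainty in bits of a two-event system with probabilities p and 1 - p."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * log2(p) + (1 - p) * log2(1 - p))


def marker_cell_significance(priors, k):
    """S(c) for a noise-free cell that is black only in character k.

    Since H(c/N) = 0, S(c) = H(N) - H(N/c) = H(c), with p(black) = priors[k].
    """
    return two_event_uncertainty(priors[k])


priors = [0.4, 0.3, 0.2, 0.1]
s1 = marker_cell_significance(priors, 0)   # cell marking the more probable character
s2 = marker_cell_significance(priors, 1)   # cell marking the less probable character
print(round(s1, 4), round(s2, 4))
```

The cell that singles out the more probable character (with that probability still below 1/2) receives the higher rating, as the property states.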

Property C, as stated above, can be generalized considerably. First of
all, the condition that $c_1$ be black in $N_1$ (rather than white) and
white (rather than black) in $N_2, \ldots, N_K$ is clearly arbitrary. The
same statement applies to the color of $c_2$. Moreover, the hypothesis
that $p(N_2) < p(N_1) \le 1/2$ can be replaced by the more general
hypothesis that
$\max\{p(N_1), 1 - p(N_1)\} < \max\{p(N_2), 1 - p(N_2)\}$, since this is a
sufficient condition for $H(c_1) > H(c_2)$.


FIGURE 1.40


The following property is far stronger than those so far considered. Any
significance function satisfying it must be immediately accepted as valid.
Unfortunately, it is not possible to prove that the significance function
considered here has this property except in a rather restricted sense.

Property D: If $S(c_1) > S(c_2)$, then the average error rate in the
identification of the unknown character will be lower if identification is
based on the set of probabilities $p(N_k/c_1)$ than if it is based on the
set of probabilities $p(N_k/c_2)$. A similar statement applies if
$S(c_{\alpha_1}, c_{\alpha_2}, \ldots) > S(c_{\beta_1}, c_{\beta_2}, \ldots)$.

Remarks: In the Baran recognition scheme, the unknown character is always
identified as the character having the greatest a posteriori probability.
This maximum a posteriori probability is, therefore, approximately equal
to the probability that the unknown character is, indeed, the character
that it is identified as. In other words, the average correct
identification rate is approximately equal to the average value of the
maximum a posteriori probability. In view of this, and recalling the
definition of S(c), Property D can be restated as follows:

Let C(H) denote the class of complete K-event systems
$\{p_1, p_2, \ldots, p_K\}$ having the uncertainty H. Suppose that the
probabilities $\{p_1, p_2, \ldots, p_K\}$ obey a K-variate probability
density function $\psi(p_1, p_2, \ldots, p_K)$. Then, according to
Property D, if $H_2 > H_1$, then $E(p_{m1}) > E(p_{m2})$, where
$E(p_{m1})$ denotes the expected value of the maximum probability of the
K-event systems $C(H_1)$, and $E(p_{m2})$ denotes the expected value of
the maximum probability of the systems $C(H_2)$. The expected values are,
of course, given by the formulas:

$$E(p_{mj}) = \int \cdots \int_{C(H_j)} \max(p_1, \ldots, p_K)\, \psi(p_1, \ldots, p_K)\, dp_1 \cdots dp_K, \qquad j = 1, 2$$

It is easy to see that Property D holds for certain trivial cases, e.g.,
when K = 2. For, once H is specified for a two-event system, the two
probabilities are uniquely determined. From Figure 1.40, moreover, it is
evident that the maximum of the two probabilities is a decreasing function
of H. Therefore Property D is satisfied. Property D must also hold if
$H_2$ assumes its maximum possible value, $\log_2 K$, for then
$p_i^{(2)} = 1/K$ for all i, so that $p_{m2} = 1/K$, where $p_{mj}$
denotes the maximum probability in a system having the uncertainty $H_j$,
and the superscripts (j) indicate that the probabilities are from a system
having the uncertainty $H_j$. If $H_1 < \log_2 K$, then at least one of
the $p_i^{(1)}$'s must be larger than $1/K = p_{m2}$, which proves the
assertion. Similarly, Property D holds in case $H_1 = 0$, for then
$p_{m1} = 1 > p_{m2}$.

For K > 2, $H_1 > 0$, and $H_2 < \log_2 K$, the situation is quite
complicated, since the probabilities $p_i^{(1)}$ and $p_i^{(2)}$ are by no
means uniquely determined. For some configurations of the probabilities
$p_i^{(1)}$ and $p_i^{(2)}$, it may indeed occur that the maximum $p_{m2}$
of the $p_i^{(2)}$ exceeds the maximum $p_{m1}$ of the $p_i^{(1)}$, even
though $H_1 < H_2$. If Property D holds, however, this will not occur "on
the average".
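A concrete pair of three-event systems shows that such configurations do exist. The two probability vectors below are illustrative choices, not taken from the report, and the function name `uncertainty` is hypothetical.

```python
from math import log2


def uncertainty(probs):
    """H = -sum p log2 p in bits; terms with p = 0 contribute nothing."""
    return -sum(p * log2(p) for p in probs if p > 0)


system_1 = (0.5, 0.5, 0.0)   # lower uncertainty, smaller maximum probability
system_2 = (0.6, 0.2, 0.2)   # higher uncertainty, larger maximum probability

H1, H2 = uncertainty(system_1), uncertainty(system_2)
print(round(H1, 4), max(system_1))
print(round(H2, 4), max(system_2))
```

Here the second system has the greater uncertainty (about 1.37 bits against 1 bit) and yet the greater maximum probability, so Property D can hold at best on the average over the class C(H), never configuration by configuration.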

In order to gain some insight into the relationships between the
uncertainty in a system and the maximum probability, it is worthwhile to
consider first the relatively simple case where K = 3. Consider a
three-dimensional Cartesian coordinate system with axes labeled $p_1$,
$p_2$, and $p_3$ (Fig. 1.41). The various possible probability
configurations for a system consisting of three events can be represented
by points in three-dimensional space. Clearly, for complete systems these
points lie on the plane $p_1 + p_2 + p_3 = 1$. Moreover, since no
probability can be negative, the points are restricted to the triangle ABC
of Figure 1.41. The lines OD, OF, and OE are segments of the perpendicular
bisectors of the sides of triangle ABC. It is evident that in Region I,
$p_1$ is the dominant probability (i.e., $p_1 > p_2$ and $p_1 > p_3$);
similarly, in Regions II and III, $p_2$ and $p_3$ dominate respectively.
On each of the lines OD, OE, and OF, the two probabilities that dominate
the adjoining regions are equal (on the boundary between Regions I and II,
for instance, $p_1 = p_2$). At the point O, of course,
$p_1 = p_2 = p_3 = 1/3$.

FIGURE 1.41

For each point on triangle ABC there is an associated uncertainty
value $H = -\sum_{i=1}^{3} p_i \log_2 p_i$. A sketch of the corresponding surface is
shown in Figure 1.42. The point of maximum uncertainty lies at the center
O of the triangle (i.e. the point (1/3, 1/3, 1/3)). The edges AB, BC, and
AC of the triangle correspond to points for which $p_1 = 0$, $p_2 = 0$, or $p_3 = 0$
respectively. Accordingly, the conditional uncertainty surface follows the
familiar two-dimensional uncertainty curve (Fig. 1.40) above these lines.

FIGURE 1.42
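
The behavior of the uncertainty surface described above can be checked
numerically. The following is a minimal Python sketch, not part of the
original report; the function name `uncertainty` and the grid resolution are
illustrative assumptions. It evaluates H at the center O, at the midpoint of
an edge, and at a vertex, and uses a coarse grid search over the triangle to
locate the point of maximum uncertainty.

```python
import math

def uncertainty(p):
    """Uncertainty H = -sum(p_i * log2(p_i)) in bits; zero terms contribute 0."""
    return -sum(x * math.log2(x) for x in p if x > 0)

h_center = uncertainty((1 / 3, 1 / 3, 1 / 3))   # center O of triangle ABC
h_edge_mid = uncertainty((0.0, 0.5, 0.5))       # midpoint of edge AB (p_1 = 0)
h_vertex = uncertainty((1.0, 0.0, 0.0))         # a vertex: no uncertainty

# Coarse grid over the triangle p_1 + p_2 + p_3 = 1 (resolution n is arbitrary).
n = 60
grid = [(i / n, j / n, (n - i - j) / n)
        for i in range(n + 1) for j in range(n + 1 - i)]
p_best = max(grid, key=uncertainty)             # grid point of largest H

print(h_center, h_edge_mid, h_vertex, p_best)
```

On this grid the maximizing point should coincide with the center O, and the
edge midpoint should give exactly 1 bit, the two-event maximum.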

The locus of points on triangle ABC having a given uncertainty value
is known as a "level curve" of the uncertainty function. Evidently, the
level curve corresponding to a specified value of $H$, say $H_0$, is simply the
intersection of the uncertainty surface with a plane parallel to ABC, and at
a distance $d = H_0$ from ABC. In Figure 1.43, level curves corresponding to
various values of $H$ (and $d$) have been sketched. When $d > \log_2 3$, the
plane does not intersect the uncertainty surface, and no curve is
formed. This corresponds to the fact that $H$ cannot exceed $\log_2 3$
bits in a system consisting of three events.

When $d = \log_2 3$, the level curve consists of the single point O. As
$d$ decreases, closed curves such as 1 and 2 are formed. When $d = 1$ ($H = 1$
bit), curve 3 is formed. This curve touches the edges of triangle ABC at
their midpoints. This is in agreement with the fact that 1 bit is the
maximum possible uncertainty in a two-event system. Level curves corresponding
to values of $H$ less than 1 bit consist of three segments (e.g. curve 4).
Finally, when $d = H = 0$, the level curve consists of the three points A, B,
and C only. This is in accordance with the fact that at these three points,
the probability configurations $(p_1, p_2, p_3)$ are (0,1,0), (0,0,1), and (1,0,0)
respectively, so that there is indeed no uncertainty.
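
The three regimes described above (closed curves for H greater than 1 bit,
contact at the edge midpoints for H equal to 1 bit, three separate segments
for H less than 1 bit) can be illustrated by asking where a level curve meets
an edge of the triangle. On an edge one probability is zero, so the
uncertainty reduces to that of a two-event system. The Python sketch below is
an illustration under that observation; `binary_uncertainty` and
`edge_crossings` are hypothetical names, and bisection is simply one
convenient way to solve for the crossing points.

```python
import math

def binary_uncertainty(q):
    """Uncertainty in bits of a two-event system with probabilities q and 1 - q."""
    if q <= 0.0 or q >= 1.0:
        return 0.0
    return -(q * math.log2(q) + (1 - q) * math.log2(1 - q))

def edge_crossings(h0, tol=1e-12):
    """Positions q along an edge (0 to 1) where the uncertainty equals h0."""
    if h0 > 1.0:
        return []            # level curve is closed and never reaches an edge
    if h0 == 1.0:
        return [0.5]         # level curve touches the edge at its midpoint
    lo, hi = 0.0, 0.5        # binary uncertainty rises from 0 to 1 on [0, 0.5]
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if binary_uncertainty(mid) < h0:
            lo = mid
        else:
            hi = mid
    q = (lo + hi) / 2
    return sorted({q, 1 - q})   # two crossings, symmetric about the midpoint

for h0 in (1.2, 1.0, 0.5):
    print(h0, edge_crossings(h0))
```

Two crossings on each of the three edges cut the level curve into three
separate pieces, one near each vertex.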

Level curves of the probabilities $p_1$, $p_2$, and $p_3$ can also be
constructed on triangle ABC. In Figure 1.44, level curves of $p_1$ have been
drawn. These are simply a series of straight lines parallel to AB. The
value of $p_1$ corresponding to a given level curve is proportional to the
distance between the level curve and AB. Entirely analogous statements
apply to the level curves of $p_2$ and $p_3$, which are straight lines
parallel to BC and AC respectively.
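
The proportionality between $p_1$ and the distance from AB can be verified
directly. The sketch below is an illustration, not taken from the report; it
assumes the vertex coordinates A = (0,1,0) and B = (0,0,1) stated earlier in
this section, and the helper name `distance_to_AB` is hypothetical. For points
on the plane $p_1 + p_2 + p_3 = 1$, the distance to the line AB should equal
$p_1 \sqrt{3/2}$.

```python
import math

A = (0.0, 1.0, 0.0)   # vertex A of triangle ABC
B = (0.0, 0.0, 1.0)   # vertex B of triangle ABC

def distance_to_AB(p):
    """Euclidean distance from point p to the line through A and B."""
    d = [b - a for a, b in zip(A, B)]            # direction of AB
    v = [x - a for a, x in zip(A, p)]            # vector from A to p
    t = sum(vi * di for vi, di in zip(v, d)) / sum(di * di for di in d)
    foot = [a + t * di for a, di in zip(A, d)]   # foot of the perpendicular
    return math.dist(p, foot)

points = [(0.2, 0.3, 0.5), (0.5, 0.25, 0.25), (1.0, 0.0, 0.0)]
for p in points:
    # The two printed values should agree for every point on the plane.
    print(p[0], distance_to_AB(p) / math.sqrt(1.5))
```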


Property D can now be discussed geometrically. Given the two complete
3-event systems P and Q, with $H(P) < H(Q)$, we begin by constructing
the level curves corresponding to $H_1 = H(P)$ and $H_2 = H(Q)$, as in Figure
1.45. According to Property D, if one examines a large sample of probability
configurations which are elements of $C(H_1)$, and averages the maximum
probabilities, the result will be greater than the corresponding
average taken over a large representative sample of elements of $C(H_2)$.

FIGURE 1.45

Alternately, if Property D holds, the line integral $I_1$ of the maximum
probability, weighted by the density function of the configurations
$(p_1, p_2, p_3)$ and taken over the level curve for $H_1$, must exceed the
corresponding integral $I_2$ taken over the level curve for $H_2$. Therefore,
in order to determine if Property D holds, it is necessary to evaluate
line integrals such as $I_1$ for various values of $H_1$. Before these
computations can be carried out, however, the density function must be known.
It can be seen that in general this density function will depend upon the a priori
probabilities of the characters, the amount of noise in the characters, the
way the noise is distributed, and other variables as well. For this reason,
it does not seem possible to make any general statements about the validity
of Property D. Instead, the following weaker property of the significance
function will be proved.

Property D': Let $C(H)$ again denote the class of complete K-event systems
$\alpha = \{p_1^{\alpha}, p_2^{\alpha}, \ldots, p_K^{\alpha}\}$ having uncertainty $H$,
and let $p^{\alpha}$ denote the maximum of the probabilities
$\{p_1^{\alpha}, p_2^{\alpha}, \ldots, p_K^{\alpha}\}$. Let $X(H)$ denote the
maximum of the probabilities $\{p^{\alpha}\}$, where $\alpha$ runs through $C(H)$.
Assume that $H_1 < H_2$. Then $X(H_1) > X(H_2)$.

Preliminary Remarks: It is readily verified from Figure 1.45 that Property
D' holds in the 3-dimensional case. The reader will recall that the
curves $\mathcal{L}_1$ and $\mathcal{L}_2$ represent level curves of $H$ corresponding to
the values $H = H_1$ and $H = H_2$ respectively, where $H_2 > H_1$.
Restricting attention to Region I of the figure, we observe that the
point $m_1$ corresponds to the probability configuration in which $p_1$
assumes its maximum possible value consistent with the restriction
that the uncertainty of the system be $H_1$. This must be so, since
the level curve of $p_1$ which passes through $m_1$ is further from
AB than any other level curve of $p_1$ which meets $\mathcal{L}_1$. By symmetry,
the value of $p_1$ at $m_1$ is identical with the maximum values attainable
by $p_2$ and $p_3$ subject to the restriction that the uncertainty of the
system be $H_1$. Therefore, this value of $p_1$ is equal to $X(H_1)$.
Analogous statements apply to the point $m_2$ on $\mathcal{L}_2$. Since $m_2$ is closer
to AB than $m_1$ is, it follows that $X(H_2) < X(H_1)$, as it should be.

The same argument can be applied to any two distinct level curves;
thus, Property D' is satisfied in the 3-dimensional case.

As an adjunct to Property D', the following statement concerning
the minimum $N(H)$ of the $p^{\alpha}$, subject to the constraint that $\alpha \in C(H)$,
can be made.

Property D'': If $H_1 < H_2$, then $N(H_1) > N(H_2)$.

Unfortunately, the author has not succeeded in proving this in
general. Referring to Figure 1.45, we observe that $n_1$ and $n_2$
are the points corresponding to the probability configurations in which
$p_1 = N(H_1)$ and $p_1 = N(H_2)$, respectively. Obviously, $n_1$
corresponds to a larger value of $p_1$ than $n_2$ does, as required by
Property D''. Applying the same argument to any two distinct level curves,
we conclude that Property D'' is true in the 3-dimensional case.
Incidentally, it can be seen that Property D'' is a consequence of Property
D' in the 3-dimensional case, as follows. From Figure 1.45, it is
evident that if Property D' were satisfied by two distinct level curves,
but Property D'' were not, the two level curves would have to
intersect somewhere. But that is impossible, since the uncertainty value
of any point on triangle ABC is unique. Hence Property D' implies
Property D'' in the 3-dimensional case. Perhaps this argument can
be extended to higher dimensions as well. Then, in view of the general
proof of Property D' given below, we would have a general proof
for Property D''.

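Proof of Property D': There exists some $\alpha \in C(H)$ such that
$X(H) = \max\{p_1^{\alpha}, p_2^{\alpha}, \ldots, p_K^{\alpha}\}$. Assume for
definiteness that $X(H) = p_1^{\alpha}$. The first step of the proof consists in
showing that $p_2^{\alpha} = p_3^{\alpha} = \ldots = p_K^{\alpha}$.

Holding $H$ constant, we now wish to find the values of $p_1, p_2, \ldots, p_K$
for which $p_1$ is maximum. For this purpose, Lagrange's Method of
Multipliers will be employed. We have the two equations

(1) $p_1 = 1 - \sum_{i=2}^{K} p_i$ (to be maximized)

(2) $H + \sum_{i=1}^{K} p_i \log_2 p_i = 0$ (constraint equation).

Multiplying (2) by $\lambda$, differentiating the result, and adding this to the
differential of (1) yields:

$$\sum_{i=2}^{K} \left[ \lambda \left( \log_2 p_i - \log_2 p_1 \right) - 1 \right] dp_i = 0.$$

Hence $\lambda = \dfrac{1}{\log_2 p_i - \log_2 p_1}$ for $i = 2, \ldots, K$. But this
implies that $p_2 = p_3 = \ldots = p_K$, which is what we started out to prove.

Now we can prove Property D' as follows: We can regard a K-event system
$\alpha = \{p_1^{\alpha}, p_2^{\alpha}, \ldots, p_K^{\alpha}\} \in C(H)$ as a two-event
system $\beta$ having the possible outcomes "outcome #1" (with probability
$p_1^{\alpha}$), or "not outcome #1" (with probability $1 - p_1^{\alpha}$), followed by
the (K-1)-event system
$\gamma = \{p_2^{\alpha}/(1 - p_1^{\alpha}), \ldots, p_K^{\alpha}/(1 - p_1^{\alpha})\}$.
It is easy to show that

$$H = -\left[ p_1^{\alpha} \log_2 p_1^{\alpha} + (1 - p_1^{\alpha}) \log_2 (1 - p_1^{\alpha}) \right] + (1 - p_1^{\alpha}) H_{\gamma} = H_{\beta} + (1 - p_1^{\alpha}) H_{\gamma},*$$

where $H_{\beta}$ and $H_{\gamma}$ denote the uncertainties of the systems $\beta$ and
$\gamma$ respectively. Therefore, for the system $\alpha$ found above, in which the
$K - 1$ remaining probabilities are equal and hence $H_{\gamma} = \log_2 (K-1)$,
we can write:

$$H = -\left[ p_1^{\alpha} \log_2 p_1^{\alpha} + (1 - p_1^{\alpha}) \log_2 (1 - p_1^{\alpha}) \right] + (1 - p_1^{\alpha}) \log_2 (K-1) = -p_1^{\alpha} \log_2 p_1^{\alpha} - (1 - p_1^{\alpha}) \log_2 \frac{1 - p_1^{\alpha}}{K-1}.$$

Now, considering $H$ as a function of $p_1^{\alpha}$ and differentiating, we find:

$$\frac{dH}{dp_1^{\alpha}} = -\left[ \log_2 p_1^{\alpha} - \log_2 (1 - p_1^{\alpha}) + \log_2 (K-1) \right].$$

Clearly, $dH/dp_1^{\alpha} = 0$ when $(K-1)\, p_1^{\alpha} = 1 - p_1^{\alpha}$, or
$p_1^{\alpha} = 1/K$. If $p_1^{\alpha} > 1/K$, it is readily verified that
$dH/dp_1^{\alpha} < 0$. Since $p_1^{\alpha} \geq 1/K$ always, it follows that
$dH/dp_1^{\alpha} \leq 0$ always, i.e. $H$ is a strictly decreasing function of
$p_1^{\alpha} = X(H)$ for $p_1^{\alpha} > 1/K$, so that $H_1 < H_2$ implies
$X(H_1) > X(H_2)$. This completes the proof of Property D'.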
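
The monotonic relationship established by the proof can be illustrated
numerically. The Python sketch below is not from the report; it assumes, as
the proof shows, that the maximizing system has its remaining K-1
probabilities equal, and it inverts H as a function of the largest
probability by bisection. The names `uncertainty_at_max` and `x_of_h` are
hypothetical.

```python
import math

def uncertainty_at_max(p1, k):
    """H (bits) of a K-event system: largest probability p1, the rest equal."""
    h_two_event = 0.0
    if 0.0 < p1 < 1.0:
        h_two_event = -(p1 * math.log2(p1) + (1 - p1) * math.log2(1 - p1))
    return h_two_event + (1 - p1) * math.log2(k - 1)

def x_of_h(h, k, tol=1e-12):
    """X(H): largest probability attainable in a K-event system of uncertainty h."""
    if not 0.0 <= h <= math.log2(k):
        raise ValueError("h must lie between 0 and log2(k)")
    lo, hi = 1.0 / k, 1.0     # H falls from log2(k) to 0 across this interval
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if uncertainty_at_max(mid, k) > h:
            lo = mid          # uncertainty still too large: p1 must be larger
        else:
            hi = mid
    return (lo + hi) / 2

for h in (0.25, 0.5, 1.0, 1.5):
    print(h, x_of_h(h, 3))    # X(H) should shrink as H grows
```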


1.6.3  Correlations  Between  Cells. 

In the first section of this chapter the difficulty of computing
higher-order conditional uncertainties was pointed out. Therefore, in order to
test the conditional uncertainty criterion experimentally, it is necessary to
make use of an alternate procedure for finding significant sequences of cells
which is based upon Property B of the significance function.** By means of
this procedure, it is possible to find highly significant sequences of cells
without having to compute higher-order conditional uncertainties. The
procedure is as follows:

* See Feinstein (7), p. 5, Lemma 3.
** Refer to the preceding section.

(1) The first-order conditional uncertainty $H(N/C_i)$ is computed for
each cell $C_i$ in the character. The cell $C_s$ having the smallest
conditional uncertainty value is noted (this cell, it will be recalled,
is the most significant one).* A certain number of cells having very
high conditional uncertainties can be eliminated from consideration
immediately.

(2) The correlation coefficient $\rho_{is} = \mu_{is} / (\sigma_i \sigma_s)$ between
$C_s$ and each cell $C_i$ not yet eliminated is computed according to the
formulas

$$\mu_{is} = E(IS) - E(I)E(S), \quad \sigma_i^2 = E(I^2) - [E(I)]^2, \quad \sigma_s^2 = E(S^2) - [E(S)]^2,$$

where $E(x)$ denotes the average value of the variable $x$ based upon a
large number of representative samples of all the characters
(properly weighted according to the assumed a priori probabilities); $I$ is
a variable whose value is +1 when $C_i$ is black, -1 otherwise; $S$ =
+1 or -1 according as $C_s$ is black or white. The cell $C_s$ with
respect to which the correlations are taken will be called the "pivot"
cell.
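
As an illustration of step (2), the following Python sketch computes the
correlation coefficient between two cells from +1/-1 samples using the usual
product-moment definition. It is a sketch under stated assumptions rather
than the program used in the study: the function name `correlation`, the
optional `weights` argument (standing in for the a priori weighting), and the
convention of returning 0 for a cell that never changes are all choices made
here for illustration.

```python
import math

def correlation(i_vals, s_vals, weights=None):
    """Correlation of two +1/-1 sequences (black = +1, white = -1)."""
    if weights is None:
        weights = [1.0] * len(i_vals)        # assumption: equal weighting
    total = sum(weights)

    def mean(xs):
        return sum(w * x for w, x in zip(weights, xs)) / total

    e_i, e_s = mean(i_vals), mean(s_vals)
    e_is = mean([a * b for a, b in zip(i_vals, s_vals)])
    var_i = 1.0 - e_i ** 2                   # E(I^2) = 1 for +1/-1 values
    var_s = 1.0 - e_s ** 2
    if var_i <= 0.0 or var_s <= 0.0:
        return 0.0                           # assumption: constant cell gives 0
    return (e_is - e_i * e_s) / math.sqrt(var_i * var_s)

same = correlation([1, -1, 1, -1], [1, -1, 1, -1])       # identical cells
opposite = correlation([1, -1, 1, -1], [-1, 1, -1, 1])   # inverted cells
unrelated = correlation([1, 1, -1, -1], [1, -1, 1, -1])  # uncorrelated cells
print(same, opposite, unrelated)
```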

It is a well known fact that $-1 \leq \rho_{is} \leq +1$. If $\rho_{is} \approx \pm 1$,
that is, if $C_i$ and $C_s$ are "highly correlated", then $I = \pm S$ almost
always. Hence, $C_i$ and $C_s$ yield essentially the same information.
According to Property B, therefore, $s(C_i/C_s) \approx 0$ for any cell $C_i$
which is highly correlated with $C_s$. Hence, if $|\rho_{is}| \approx 1$, it can be
concluded immediately that cell $C_i$ is rendered insignificant by observation
of $C_s$. Any sequence of significant cells beginning with $C_s$ would therefore
not contain cell $C_i$. All such cells can therefore be eliminated from further
consideration. All cells surviving the second sieving must also have survived
the first sieving process, which was on the basis of conditional uncertainty.
Therefore, these cells can be regarded as both significant and independent of
$C_s$. If the first two sievings have been sufficiently severe, that is, if
enough cells have been eliminated each time, the number of remaining cells will
be small and an acceptable reduced scanning path will have been achieved.

* In case of a tie, an arbitrary choice should be made.

If it is desired to eliminate more cells, a further sieving process can be
carried out by selecting a second "pivot" cell $C_t$ from among those not
yet eliminated, computing correlation coefficients between it and all
remaining cells, and eliminating all cells for which $|\rho_{it}| \approx 1$. The
pivot cell $C_t$ can be selected by one of the following simple methods, each
of which requires no further computations:

(a) Choose as $C_t$ that cell, distinct from $C_s$, having the smallest
conditional uncertainty value;*

(b) Choose as $C_t$ that cell, distinct from $C_s$, having the smallest
value of $|\rho_{is}|$.*

* In case of ties, an arbitrary choice can be made.

The process can be repeated as many times as necessary. The final
result is a minimal or near-minimal scanning path. In Chapters III and IV of
Report 62-68, computer programs for carrying out the above process are
described, and the results of actual trials are discussed.
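
The repeated sieving can be sketched in a few lines of Python. This is an
illustrative reading of the procedure, not a reconstruction of the programs
in Report 62-68. It assumes equally weighted +1/-1 samples, selects each new
pivot by method (a), and repeats until every remaining cell has either become
a pivot or been eliminated, so that the pivots form the scanning path. The
names `sieve`, `h_cutoff`, and `rho_cutoff`, and the numerical thresholds used
in the example, are hypothetical.

```python
import math

def correlation(xs, ys):
    """Product-moment correlation of two equally weighted +1/-1 sequences."""
    n = len(xs)
    ex, ey = sum(xs) / n, sum(ys) / n
    exy = sum(x * y for x, y in zip(xs, ys)) / n
    var = (1.0 - ex * ex) * (1.0 - ey * ey)   # product of the two variances
    return 0.0 if var <= 0.0 else (exy - ex * ey) / math.sqrt(var)

def sieve(cond_uncertainty, samples, h_cutoff, rho_cutoff):
    """Return the chosen pivot cells, in order, as a reduced scanning path."""
    # First sieving: drop cells whose conditional uncertainty is too high.
    remaining = [c for c, h in enumerate(cond_uncertainty) if h <= h_cutoff]
    path = []
    while remaining:
        # Method (a): the pivot is the remaining cell with the smallest
        # conditional uncertainty; min() settles ties by taking the first.
        pivot = min(remaining, key=lambda c: cond_uncertainty[c])
        path.append(pivot)
        pivot_col = [s[pivot] for s in samples]
        # Correlation sieving: drop cells highly correlated with the pivot.
        remaining = [
            c for c in remaining
            if c != pivot
            and abs(correlation([s[c] for s in samples], pivot_col)) < rho_cutoff
        ]
    return path

# Four cells: cell 1 duplicates cell 0, cell 2 is uncorrelated with cell 0,
# and cell 3 has a high conditional uncertainty.
samples = [
    [1, 1, 1, 1],
    [1, 1, -1, -1],
    [-1, -1, 1, -1],
    [-1, -1, -1, 1],
]
path = sieve([0.2, 0.3, 0.5, 1.9], samples, h_cutoff=1.0, rho_cutoff=0.9)
print(path)   # prints [0, 2]
```

In this toy example cell 3 is removed by the first sieving and cell 1 by the
correlation sieving on pivot 0, leaving cells 0 and 2.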

In applying the technique described above, the following difficult
question must be answered: How many cells should be eliminated at each stage?
Obviously, the answer to this question depends upon the nature of the
character samples dealt with. As a general rule, however, it is advisable to retain
a fairly large number of cells after the first (conditional uncertainty) sieving,
since most of the cells obtained will be redundant. Subsequent correlation
sieving will, therefore, result in elimination of a large number of cells,
leaving only a few independent, significant ones. If too many cells are
eliminated in the first sieving, there is a possibility that an insufficient number
of independent ones will remain and satisfactory recognition will hence not
be achieved.

It is worthwhile to consider the computational simplicity of the
correlation-sieving process in comparison to the higher-order conditional
uncertainty sieve. Whereas the conditional uncertainty sieving becomes more
difficult by an order of magnitude at each stage, the correlation process actually
becomes simpler since there are fewer cells to be dealt with at each stage.

CHAPTER II - THIN FILM STUDIES


The research outlined in this chapter is described in detail in
Technical Report No. 62-37 dated August 1962.

A.  STATIC  MAGNETIC  BEHAVIOR  OF  MAGNETOSTRICTIVE 
THIN  FILMS 

Experimental  Studies 

a.  Stress  effects  on  hysteresis  loops 

b.  Stress  effects  on  critical  curve 

c.  Stress  effects  on  domain  wall  configurations 
(Bitter  patterns) 

d.  Film  magnetostriction  measurements 

B.  DYNAMIC  BEHAVIOR  OF  MAGNETOSTRICTIVE  THIN  FILMS 

1. Theoretical Model - Equation of Landau and Lifshitz

a.  Preliminary  considerations,  assumptions 

b.  Solution  by  analog  computer,  stress  effects  on  switching 

c.  Solution  by  digital  computer,  stress  effects  on  switching 

d.  Comparison  between  analog  and  digital  computer  results 

e.  Computational  flow  charts  for  integration  and  plotting  of 
film  dynamic  response 

f. Effects of parameters α, β, δ on film switching response

1.  Film  Switching  Response 

non-destructive pulsing
destructive  pulsing 

2.  Inverse  Peaking  Time  vs.  Switching  Field 

3.  Ferromagnetic  Resonant  Frequency  vs.  Switching  Field 

4.  Dynamic  Switching  Threshold  vs.  Perpendicular  Field 

2.  Instrumentation  for  the  Study  of  Stress  Effects  on  Thin  Film  Switching 

a.  Description  of  instrumentation 

b.  Operation  and  design  considerations 

1.  Strip  Line  Design 

2.  Estimation  of  Strip  Line  Available  Magnetic  Field 

3.  Estimation  of  Film  Pickup  Voltage 

4.  Mercury  Relays,  Coaxial  Cables,  Connectors 

C.  PIEZOELECTRIC  CRYSTALS  AND  PIEZOELECTRIC  MAGNETO¬ 
STRICTIVE  COUPLING 

1. Characteristics of Piezoelectric and Electrostrictive Crystals. Table

2. Theoretical Models of Piezoelectric Behavior

a.  General  methods  of  analysis,  piezoelectric  equations  of  state, 
equation  of  motion,  magnitude  of  piezoelectric  constant  d. 

b. Electrostrictive bar, longitudinal mode (DC strains). See
technical report No. 60-82 dated Nov. 1960, pp. IV-9, 10.

c. Electrostrictive bar, thickness mode (DC and transient
strains). See technical report No. 60-82 dated Nov. 1960,
pp. IV-9, 10 and IV-33, 50 (Appendix B).

d.  Electrostrictive  bar,  bending  mode,  bimorph  (DC  strains). 

e.  Electrostrictive  disc,  radial  mode,  (DC  strains). 

f.  Electrostrictive  thin  wall  closed  cylinder,  radial  mode 
(DC  strains). 

g. Electrostrictive thin walled split cylinder, flexural mode
(DC strains).

h. Piezoelectric ADP plate X-cut, quartz plate Y-cut (DC strains).

3.  Experimental  Studies  on  Electrostrictive  Ceramics  (DC  strains) 

a.  Electrostrictive  disc 

b.  Electrostrictive  cylinder 

c.  Electrostrictive  split  cylinder 

d.  Experimental  determination  of  coupling  coefficient. 

4. Piezoelectric (electrostrictive)-Magnetostrictive Coupling.

a. Theoretical model of static behavior. Piezoelectric uniaxial
and isotropic stress.

b.  Experimental  studies.  Isotropic  stress. 

APPENDICES 

I 

Computer Programs and Experiments

1.  Program  for  computation  of  hysteresis  loops. 

2.  Program  for  computation  and  plotting  of  static  critical 
switching  curves. 

3. Description of substrate bending jig for Bitter pattern studies.

4.  Description  of  device  for  bending  thin  substrates  in  B-H 
loop  tester. 

5. Program for solution of Landau-Lifshitz equation and
computer  curve  plotting. 

6.  Program  for  computation  and  plotting  of  "inverse  peaking 
time"  vs  switching  field  amplitude. 

7.  Program  for  computation  and  plotting  of  ferromagnetic 
oscillation  frequency  vs.  switching  field  amplitude. 

8.  Program  for  computation  and  plotting  of  dynamic  critical 
switching  curve. 

9. Design and construction of probe to determine the direction
of the earth's magnetic field.

10. Construction of system for production of thin films by vacuum
evaporation. Construction of auxiliary instrumentation for
magnetic thin film preparation.

CHAPTER III - OPTIMIZATION STUDIES


During the period of June 1961 to May 1962, the emphasis was placed
on individual optimization problems which can be studied analytically, either
exactly or by a sequence of approximations which can themselves be studied
analytically.

Admittedly, the class of optimization problems that can be discussed
this way is limited; however, it is felt that any additional light one could shed
by analytical means is of help.

Technical  reports  and  papers  written  during  this  period  reflect  this 
attitude. 

For  details,  readers  are  referred  to: 

Tech. Report 61-62, also to appear in Automatica.

Tech. Report 61-66, also AIEE Transactions, Applications and Industry,
pp. 125-127, July 1962.

Tech. Report 61-82, also to appear in J. of Math. Analysis and Applications.

The  above  reports  take  problems  from  the  area  of  control  systems 
optimization. 

An optimization study involving a probabilistic study of the relative
efficiency of various Variable Structure Computers in carrying out the
Aitken-Neville interpolation procedure was undertaken during this period and
continued into the next year.